Dense Reward Research Articles

Due to the complex internal working process of circulating cooling water systems, most traditional control methods struggle to achieve stable and precise control. Therefore, this paper presents a novel adaptive control structure for the Twin Delayed Deep Deterministic Policy Gradient algorithm, which is based on a reference trajectory model (TD3-RTM). The structure is based on the Markov decision process of the recirculating cooling water system. Initially, the TD3 algorithm is employed to construct a deep reinforcement learning agent. Subsequently, a state space is selected, and a dense reward function is designed, considering the multivariable characteristics of the recirculating cooling water system. The agent updates its network based on different reward values obtained through interactions with the system, thereby gradually aligning the action values with the optimal policy. The TD3-RTM method introduces a reference trajectory model to accelerate the convergence speed of the agent and reduce oscillations and instability in the control system. Subsequently, simulation experiments were conducted in MATLAB/Simulink. The results show that compared to PID, fuzzy PID, DDPG and TD3, the TD3-RTM method improved the transient time in the flow loop by 6.09s, 5.29s, 0.57s, and 0.77s, respectively, and the Integral of Absolute Error(IAE) indexes decreased by 710.54, 335.1, 135.97, and 89.96, respectively, and the transient time in the temperature loop improved by 25.84s, 13.65s, 15.05s, and 0.81s, and the IAE metrics were reduced by 143.9, 59.13, 31.79, and 1.77, respectively. In addition, the overshooting of the TD3-RTM method in the flow loop was reduced by 17.64, 7.79, and 1.29 per cent, respectively, in comparison with the PID, the fuzzy PID, and the TD3.

PurposeAutonomous navigation of catheters and guidewires can enhance endovascular surgery safety and efficacy, reducing procedure times and operator radiation exposure. Integrating tele-operated robotics could widen access to time-sensitive emergency procedures like mechanical thrombectomy (MT). Reinforcement learning (RL) shows potential in endovascular navigation, yet its application encounters challenges without a reward signal. This study explores the viability of autonomous guidewire navigation in MT vasculature using inverse reinforcement learning (IRL) to leverage expert demonstrations.MethodsEmploying the Simulation Open Framework Architecture (SOFA), this study established a simulation-based training and evaluation environment for MT navigation. We used IRL to infer reward functions from expert behaviour when navigating a guidewire and catheter. We utilized the soft actor-critic algorithm to train models with various reward functions and compared their performance in silico.ResultsWe demonstrated feasibility of navigation using IRL. When evaluating single- versus dual-device (i.e. guidewire versus catheter and guidewire) tracking, both methods achieved high success rates of 95% and 96%, respectively. Dual tracking, however, utilized both devices mimicking an expert. A success rate of 100% and procedure time of 22.6 s were obtained when training with a reward function obtained through ‘reward shaping’. This outperformed a dense reward function (96%, 24.9 s) and an IRL-derived reward function (48%, 59.2 s).ConclusionsWe have contributed to the advancement of autonomous endovascular intervention navigation, particularly MT, by effectively employing IRL based on demonstrator expertise. The results underscore the potential of using reward shaping to efficiently train models, offering a promising avenue for enhancing the accessibility and precision of MT procedures. We envisage that future research can extend our methodology to diverse anatomical structures to enhance generalizability.

Dense Reward Research Articles

Related Topics

Articles published on Dense Reward

Impulsive maneuver strategy for multi-agent orbital pursuit-evasion game under sparse rewards

A Deep Reinforcement Learning Method Based on a Transformer Model for the Flexible Job Shop Scheduling Problem

Trajectory tracking control based on deep reinforcement learning and ensemble random network distillation for robotic manipulator

A train trajectory optimization method based on the safety reinforcement learning with a relaxed dynamic reward

Human-Inspired Meta-Reinforcement Learning Using Bayesian Knowledge and Enhanced Deep Q-Network

Adaptive control for circulating cooling water system using deep reinforcement learning.

Deep Model-Based Reinforcement Learning for Predictive Control of Robotic Systems with Dense and Sparse Rewards

Autonomous navigation of catheters and guidewires in mechanical thrombectomy using inverse reinforcement learning

Optimization of 2D Irregular Packing: Deep Reinforcement Learning with Dense Reward

Unveiling the Significance of Toddler-Inspired Reward Transition in Goal-Oriented Reinforcement Learning

Learning strategies for underwater robot autonomous manipulation control

Enhancing variational quantum state diagonalization using reinforcement learning techniques

Performance Analysis of Different Reward Functions in Reinforcement Learning for the Scheduling of Modular Automotive Production Systems

Informative Trajectory Planning Using Reinforcement Learning for Minimum-Time Exploration of Spatiotemporal Fields.

Decomposing user-defined tasks in a reinforcement learning setup using TextWorld.

DisTop: Discovering a Topological Representation to Learn Diverse and Rewarding Skills

Reinforcement learning to achieve real-time control of triple inverted pendulum

A distributed multi-vehicle pursuit scheme: generative multi-adversarial reinforcement learning

Bi-Dueling DQN Enhanced Two-Stage Scheduling for Augmented Surveillance in Smart EMS

A partially observable multi-ship collision avoidance decision-making model based on deep reinforcement learning

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Dense Reward Research Articles

Related Topics

Articles published on Dense Reward

Impulsive maneuver strategy for multi-agent orbital pursuit-evasion game under sparse rewards

A Deep Reinforcement Learning Method Based on a Transformer Model for the Flexible Job Shop Scheduling Problem

Trajectory tracking control based on deep reinforcement learning and ensemble random network distillation for robotic manipulator

A train trajectory optimization method based on the safety reinforcement learning with a relaxed dynamic reward

Human-Inspired Meta-Reinforcement Learning Using Bayesian Knowledge and Enhanced Deep Q-Network

Adaptive control for circulating cooling water system using deep reinforcement learning.

Deep Model-Based Reinforcement Learning for Predictive Control of Robotic Systems with Dense and Sparse Rewards

Autonomous navigation of catheters and guidewires in mechanical thrombectomy using inverse reinforcement learning

Optimization of 2D Irregular Packing: Deep Reinforcement Learning with Dense Reward

Unveiling the Significance of Toddler-Inspired Reward Transition in Goal-Oriented Reinforcement Learning

Learning strategies for underwater robot autonomous manipulation control

Enhancing variational quantum state diagonalization using reinforcement learning techniques

Performance Analysis of Different Reward Functions in Reinforcement Learning for the Scheduling of Modular Automotive Production Systems

Informative Trajectory Planning Using Reinforcement Learning for Minimum-Time Exploration of Spatiotemporal Fields.

Decomposing user-defined tasks in a reinforcement learning setup using TextWorld.

DisTop: Discovering a Topological Representation to Learn Diverse and Rewarding Skills

Reinforcement learning to achieve real-time control of triple inverted pendulum

A distributed multi-vehicle pursuit scheme: generative multi-adversarial reinforcement learning

Bi-Dueling DQN Enhanced Two-Stage Scheduling for Augmented Surveillance in Smart EMS

A partially observable multi-ship collision avoidance decision-making model based on deep reinforcement learning