Abstract

An improved TD3 algorithm is presented to address the low success rate and slow learning speed of the TD3 (Twin Delayed Deep Deterministic Policy Gradient) algorithm in mobile robot path planning. Prioritized experience replay is added and a dynamic delayed update strategy is designed; together they reduce the impact of value estimation errors, improve the success rate of path planning, and shorten training time. A Turtlebot3 robot model and a simulation environment are built on ROS Melodic and the Gazebo simulator, and the effectiveness of the improved TD3 algorithm is verified by comparative experiments. For the mobile robot path planning task, the simulation results show that, compared with the original TD3 algorithm, the improved algorithm raises the success rate by 15.6% and shortens the training navigation time by 20%. These results indicate that the improved TD3 algorithm achieves better performance in path planning for mobile robots with continuous action spaces.
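
The abstract does not say which variant of prioritized experience replay is used, so the following is only a minimal sketch of the common proportional scheme, in which transitions are sampled with probability proportional to their TD error and importance-sampling weights correct the resulting bias. The class name `PrioritizedReplayBuffer` and the parameters `alpha`, `beta`, and `eps` are illustrative assumptions, not taken from the paper, and the dynamic delayed update strategy is not shown because its rule is not described here.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (hypothetical, list-based sketch)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # 0 gives uniform sampling, 1 gives fully proportional
        self.eps = eps          # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0            # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions receive the current maximum priority,
        # so they are likely to be replayed before their TD error is known.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # Sampling probability P(i) = p_i^alpha / sum_k p_k^alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        n = len(self.data)
        idxs = random.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights w_i = (N * P(i))^(-beta),
        # normalized by the largest weight in this batch (a simplification).
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        batch = [self.data[i] for i in idxs]
        return batch, idxs, weights

    def update_priorities(self, idxs, td_errors):
        # Called after a critic update with the fresh absolute TD errors.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a TD3 training loop, the returned weights would typically scale each sample's critic loss, and the critic's TD errors would be passed back through `update_priorities`. The list-based sampling here costs O(N) per call; practical implementations usually use a sum tree instead.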
