Abstract

This paper presents an improved deep deterministic policy gradient (DDPG) algorithm for a six-DOF (six degree-of-freedom) arm robot. First, we build a robot model based on the DH (Denavit-Hartenberg) parameters of the UR5 arm robot. Then, we improve the experience pool of the traditional DDPG algorithm by adding a success experience pool and a collision experience pool. Next, the reward function is improved to increase the reward for success and the penalty for collision. Finally, training is divided into stages: the first three joints are trained first, and then all six. Simulation results in ROS (Robot Operating System) show that the improved DDPG algorithm effectively solves the problem of the six-DOF arm robot moving too far in configuration space. The trained model can reach the target area within five steps. Compared with the traditional DDPG algorithm, the improved algorithm requires fewer training episodes yet achieves better results.
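The segmented experience pool described above can be sketched as a replay memory with three sub-pools. This is a minimal illustration, assuming a standard (s, a, r, s', done) transition format; the pool capacities and the sampling fractions are illustrative assumptions, not values from the paper.

```python
import random
from collections import deque

class SegmentedReplayBuffer:
    """Sketch of a DDPG replay memory split into three pools, as the
    abstract describes: ordinary transitions, successful transitions,
    and collision transitions."""

    def __init__(self, capacity=10000):
        self.normal = deque(maxlen=capacity)
        self.success = deque(maxlen=capacity)
        self.collision = deque(maxlen=capacity)

    def add(self, transition, reached_goal=False, collided=False):
        # Route each (s, a, r, s', done) tuple to the matching pool.
        if reached_goal:
            self.success.append(transition)
        elif collided:
            self.collision.append(transition)
        else:
            self.normal.append(transition)

    def sample(self, batch_size, success_frac=0.25, collision_frac=0.25):
        # Oversample the rarer success/collision experience so that
        # their stronger reward/penalty signal appears in every
        # minibatch (the fractions here are assumed, not from the paper).
        n_s = min(int(batch_size * success_frac), len(self.success))
        n_c = min(int(batch_size * collision_frac), len(self.collision))
        n_n = min(batch_size - n_s - n_c, len(self.normal))
        batch = (random.sample(self.success, n_s)
                 + random.sample(self.collision, n_c)
                 + random.sample(self.normal, n_n))
        random.shuffle(batch)
        return batch
```

In a DDPG training loop, `add` would be called once per environment step and `sample` once per gradient update, exactly where a single replay buffer would otherwise be used.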
