Abstract

Although there is extensive work on deep reinforcement learning (DRL) for robotics, sequential trajectory generation for multiprocess robotic tasks based on DRL is yet to be explored. In this article, the multiprocess task is formulated as a Markov decision process, and a nested dual-memory deep deterministic policy gradient algorithm with dynamic criteria is proposed to generalize traditional trajectory planning with a predefined target point into a trajectory exploration problem aimed at a target area, without solving inverse kinematics. First, a dual-memory architecture with a local-to-global strategy is introduced to enhance performance. Second, a novel nested architecture is proposed to generate sequential trajectory segments successively and asynchronously for the multiprocess task. Third, a compound reward system is designed, and a weight coefficient matrix is adopted to balance position control and orientation control based on Tait–Bryan angles. In addition, a virtual twin system is established to improve training efficiency, so that trajectories generated in simulation can be directly applied to the real physical platform. Finally, experimental results on both simulated and real-world applications verify the performance of the proposed approach.
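To make the compound reward concrete, the sketch below shows one plausible reading of a weight coefficient matrix balancing position error against Tait–Bryan orientation error. This is a minimal illustration, not the authors' implementation: the function name `compound_reward`, the quadratic error form, and the specific weight values are all assumptions for demonstration.

```python
import numpy as np

def compound_reward(pos, pos_target, tb_angles, tb_target, W):
    """Hypothetical compound reward balancing position and orientation control.

    pos, pos_target      -- end-effector position (x, y, z), metres
    tb_angles, tb_target -- Tait-Bryan angles (roll, pitch, yaw), radians
    W                    -- 6x6 weight coefficient matrix trading off position
                            error against orientation error (assumed diagonal)
    """
    # Stack position and orientation errors into a single 6-D error vector.
    e = np.concatenate([pos - pos_target, tb_angles - tb_target])
    # Negative weighted quadratic error: the reward increases as the end
    # effector approaches the target area in both position and orientation.
    return -float(e @ W @ e)

# Example usage: weight position errors more heavily than orientation errors.
W = np.diag([1.0, 1.0, 1.0, 0.2, 0.2, 0.2])
r = compound_reward(np.array([0.30, 0.10, 0.25]),
                    np.array([0.32, 0.10, 0.24]),
                    np.array([0.0, 0.1, 0.0]),
                    np.zeros(3),
                    W)
```

Under this reading, scaling the orientation weights relative to the position weights is what lets a single scalar reward steer both aspects of the end-effector pose toward the target area.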
