Reinforcement learning (RL), which autonomously explores optimal control policies, has become a crucial direction for developing intelligent robots, while Dynamic Movement Primitives (DMPs) serve as a powerful tool for efficiently expressing robot trajectories. This article investigates an efficient integration of RL and DMPs that improves learning efficiency and control performance in robot manipulation tasks by focusing on the form and smoothness of control actions. A novel approach, DDPG-DMP, is proposed to address the efficiency and feasibility issues of existing RL approaches that employ DMPs to generate control actions. The proposed method naturally integrates a DMP-based policy into the actor–critic framework of the traditional RL approach Deep Deterministic Policy Gradient (DDPG) and derives the corresponding update formulas for learning the networks that determine the DMP parameters. A novel inverse controller is further introduced to adaptively learn the mapping from observed states to the various robot control signals generated through DMPs, eliminating the need for human prior knowledge. Evaluated on five robot-arm control benchmark tasks, DDPG-DMP demonstrates significant advantages in control performance, learning efficiency, and smoothness of robot actions compared to related baselines, highlighting its potential for complex robot control applications.
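The abstract describes a policy whose outputs are DMP parameters rather than raw control actions. The following is a minimal sketch, not the authors' implementation, of how a discrete DMP can be rolled out from parameters that an actor network would produce; the function name `dmp_rollout`, the gain constants, and the random stand-in for the actor output are all illustrative assumptions.

```python
# Minimal sketch (assumed, not the paper's code): a 1-D discrete DMP whose
# basis-function weights and goal would be supplied by an actor network,
# as in the DDPG-DMP idea described in the abstract.
import numpy as np

def dmp_rollout(y0, goal, weights, duration=1.0, dt=0.01,
                alpha_z=25.0, beta_z=6.25, alpha_x=1.0):
    """Integrate a 1-D discrete DMP and return the generated trajectory."""
    n_basis = len(weights)
    # Basis-function centres spread along the canonical phase x in (0, 1].
    centers = np.exp(-alpha_x * np.linspace(0.0, 1.0, n_basis))
    widths = n_basis ** 1.5 / centers / alpha_x

    y, z, x = y0, 0.0, 1.0
    trajectory = [y]
    for _ in range(int(duration / dt)):
        psi = np.exp(-widths * (x - centers) ** 2)
        # Forcing term shaped by the weights; it vanishes as x -> 0,
        # so the trajectory converges to the goal attractor.
        forcing = (psi @ weights) / (psi.sum() + 1e-10) * x * (goal - y0)
        dz = alpha_z * (beta_z * (goal - y) - z) + forcing
        z += dz * dt / duration
        y += z * dt / duration
        x += -alpha_x * x * dt / duration
        trajectory.append(y)
    return np.array(trajectory)

# In DDPG-DMP the actor network maps the observed state to DMP parameters;
# a random vector stands in for that output here purely for illustration.
rng = np.random.default_rng(0)
actor_output = rng.normal(scale=10.0, size=11)   # 10 weights + 1 goal offset
weights, goal = actor_output[:-1], 0.5 + 0.01 * actor_output[-1]
print(dmp_rollout(y0=0.0, goal=goal, weights=weights)[-5:])
```

Because the action is a whole parameterized trajectory segment rather than an instantaneous control signal, the resulting motion is smooth by construction, which is the property the abstract emphasizes.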