Abstract

To address the slow convergence and high randomness of the original DQN (Deep Q-Network) algorithm in manipulator trajectory planning, the advantages of MPC (Model Predictive Control) are fused with deep reinforcement learning, and an MPC-guided sampling DQN algorithm is proposed. First, the method reduces the number of failures during training by applying constraint control to the manipulator based on a dynamic model. Second, the MPC algorithm is run from different initial states, and the resulting trajectories are sampled and stored after iterative optimization by a linear-Gaussian controller; these high-success-rate samples speed up the training of the neural network. Finally, a virtual simulation environment for the manipulator is built on the CoppeliaSim platform to validate the algorithm. The results show that the improved DQN algorithm increases learning efficiency by nearly 1.5 times, significantly outperforming the original DQN algorithm.
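To illustrate the guided-sampling idea, the following is a minimal Python sketch, not the authors' implementation: an MPC-style controller is rolled out from varied initial states, and only successful trajectories are stored to pre-fill a DQN replay buffer. The `env`/`env_model` interfaces, the `reached_goal` flag, and the discrete action set are all hypothetical, and the paper's linear-Gaussian controller is replaced here with a simple random-shooting rollout for brevity.

import random
from collections import deque

import numpy as np

class ReplayBuffer:
    """Fixed-size buffer of (state, action, reward, next_state, done) tuples."""
    def __init__(self, capacity=50_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))

def mpc_action(env_model, state, candidate_actions, horizon=5):
    """Pick the discrete action whose short model rollout accumulates the
    highest predicted return (a crude random-shooting MPC step)."""
    best_action, best_return = None, -np.inf
    for a in candidate_actions:
        s, act, total = state, a, 0.0
        for _ in range(horizon):
            s, r = env_model.predict(s, act)        # assumed model interface
            total += r
            act = random.choice(candidate_actions)  # random tail rollout
        if total > best_return:
            best_action, best_return = a, total
    return best_action

def collect_guided_samples(env, env_model, actions, buffer, episodes=20):
    """Run the MPC controller from different initial states and keep only
    successful trajectories, mirroring the guided-sampling step."""
    for _ in range(episodes):
        state = env.reset()               # a new initial state each episode
        trajectory, done, success = [], False, False
        while not done:
            action = mpc_action(env_model, state, actions)
            next_state, reward, done, info = env.step(action)
            trajectory.append((state, action, reward, next_state, done))
            success = info.get("reached_goal", False)
            state = next_state
        if success:                       # store only high-success samples
            for t in trajectory:
                buffer.push(t)

In this sketch, DQN training would then draw minibatches from the pre-filled buffer from the start, which is one plausible way such high-quality samples could accelerate early learning.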
