Abstract

This paper proposes a high-speed learning-from-demonstrations (LfD) method for the sparse-reward motion planning problem of a manipulator, combining the hindsight experience replay (HER) mechanism with the deep deterministic policy gradient (DDPG) method. First, a demonstration replay buffer and an agent-exploration replay buffer are created to store experience data, and the hindsight experience replay mechanism then draws experience data from both buffers. Next, the deep deterministic policy gradient method learns from this experience data to accomplish manipulator motion planning tasks under the sparse reward. Finally, experiments on the pushing and pick-and-place tasks were conducted in the Gym robotics environments. The results show that training is at least 10 times faster than the deep deterministic policy gradient method without demonstration data. In addition, the proposed method makes effective use of the sparse reward, and the agent can complete the task quickly even when the success rate of the demonstration data is low.
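The two-buffer replay scheme described above can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the `EpisodeBuffer` class, the "future" goal-relabeling strategy, the distance tolerance, and the demonstration/exploration mixing ratio are all assumptions introduced here for clarity, and the DDPG actor-critic networks are omitted.

```python
import random
import numpy as np

def sparse_reward(achieved_goal, goal, tol=0.05):
    """Sparse reward: 0 on success, -1 otherwise (tol is an assumed threshold)."""
    return 0.0 if np.linalg.norm(achieved_goal - goal) < tol else -1.0

class EpisodeBuffer:
    """Stores whole episodes so HER can relabel goals in hindsight."""
    def __init__(self, capacity=1000):
        self.episodes = []
        self.capacity = capacity

    def add(self, episode):
        # episode: list of dicts with keys obs, action, achieved_goal, goal
        if len(self.episodes) >= self.capacity:
            self.episodes.pop(0)
        self.episodes.append(episode)

    def sample(self, batch_size, her_ratio=0.8):
        """Sample transitions; with probability her_ratio, replace the desired
        goal with a goal achieved later in the same episode (the 'future'
        HER strategy) and recompute the sparse reward accordingly."""
        batch = []
        for _ in range(batch_size):
            ep = random.choice(self.episodes)
            t = random.randrange(len(ep))
            tr = dict(ep[t])  # copy so relabeling does not mutate the buffer
            if random.random() < her_ratio:
                future = random.randrange(t, len(ep))
                tr["goal"] = ep[future]["achieved_goal"]
            tr["reward"] = sparse_reward(tr["achieved_goal"], tr["goal"])
            batch.append(tr)
        return batch

# Two buffers, as in the paper: one pre-filled with demonstrations,
# the other filled by the agent's own exploration.
demo_buffer, explore_buffer = EpisodeBuffer(), EpisodeBuffer()

def sample_training_batch(batch_size=128, demo_fraction=0.25):
    """Mix demonstration and exploration transitions for the DDPG update.
    The 25/75 split is an assumed hyperparameter, not taken from the paper."""
    n_demo = int(batch_size * demo_fraction)
    return (demo_buffer.sample(n_demo)
            + explore_buffer.sample(batch_size - n_demo))
```

Each sampled minibatch would then feed a standard DDPG update; because HER relabels failed trajectories with goals that were actually reached, the otherwise rare nonzero sparse reward appears in training far more often.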
