Abstract
Continuous robotic control with sparse rewards is a longstanding challenge in deep reinforcement learning (DRL). While existing DRL algorithms have made great progress in learning policies from visual observations, learning an effective policy still requires an impractical number of real-world data samples. Moreover, some robotic tasks are naturally specified with sparse rewards, which makes the already scarce data even less efficient to learn from and slows training to the point of infeasibility. In addition, manually shaping reward functions is complex work, because it requires domain-specific knowledge and human intervention. To alleviate these issues, this paper proposes a model-free, off-policy DRL approach named TD3MHER to learn manipulation policies for continuous robotic tasks with sparse rewards. Specifically, TD3MHER combines the Twin Delayed Deep Deterministic policy gradient algorithm (TD3) with Model-driven Hindsight Experience Replay (MHER) to achieve highly sample-efficient training: while the agent is learning its policy, TD3MHER also helps it learn an approximate physical model of the robot that is useful for solving the task, and it does not require any additional robot-environment interactions. The performance of TD3MHER is assessed on a simulated robotic task with a 7-DOF manipulator, comparing the proposed technique against a previous DRL algorithm to verify its usefulness. Experimental results on the simulated robotic task show that the proposed approach successfully exploits previously stored samples with sparse rewards and achieves faster learning.
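To make the combination concrete, the sketch below illustrates the general idea of hindsight relabeling augmented with a learned dynamics model: stored transitions are relabeled with goals actually achieved later in the episode, and the learned model synthesizes extra transitions without new robot-environment interaction. This is only a minimal illustration of the technique, not the paper's exact algorithm; the names `dynamics_model`, `goal_from_state`, `relabel_episode`, and `GOAL_TOLERANCE` are assumed placeholders, and a sparse reward of 0 on success / -1 otherwise is assumed.

```python
# Minimal sketch of model-driven hindsight relabeling. All names and the
# relabeling scheme are illustrative assumptions, not the paper's exact method.
import numpy as np

GOAL_TOLERANCE = 0.05  # assumed distance threshold for "goal achieved"

def sparse_reward(achieved_goal, desired_goal):
    """Sparse reward: 0 if the achieved goal is within tolerance, else -1."""
    return 0.0 if np.linalg.norm(achieved_goal - desired_goal) < GOAL_TOLERANCE else -1.0

def relabel_episode(episode, dynamics_model, goal_from_state, k_future=4, seed=0):
    """Augment one stored episode with hindsight goals and model-generated data.

    episode         -- list of dicts with keys: state, action, achieved_goal
    dynamics_model  -- learned forward model: (state, action) -> predicted next state
    goal_from_state -- extracts the achieved goal (e.g. gripper position) from a state
    """
    rng = np.random.default_rng(seed)
    augmented = []
    T = len(episode)
    for t, step in enumerate(episode):
        # Standard "future" HER strategy: pretend a goal achieved later in the
        # episode was the desired goal all along, turning failures into successes.
        for f in rng.integers(t, T, size=min(k_future, T - t)):
            new_goal = episode[f]["achieved_goal"]
            augmented.append({
                "state": step["state"], "action": step["action"], "goal": new_goal,
                "reward": sparse_reward(step["achieved_goal"], new_goal),
            })
        # Model-driven augmentation: roll the learned dynamics one step forward
        # to synthesize a virtual transition for the same stored action,
        # requiring no new robot-environment interaction.
        virtual_next = dynamics_model(step["state"], step["action"])
        final_goal = episode[-1]["achieved_goal"]
        augmented.append({
            "state": step["state"], "action": step["action"], "goal": final_goal,
            "reward": sparse_reward(goal_from_state(virtual_next), final_goal),
        })
    return augmented
```

In a TD3-based pipeline, the relabeled and model-generated transitions would simply be pushed into the off-policy replay buffer alongside the original ones, which is what makes the combination sample-efficient: each real interaction yields several informative training samples instead of one uninformative sparse-reward sample.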