Abstract

This paper aims to improve the efficiency of deep reinforcement learning (DRL)-based methods for robotic trajectory planning in unstructured working environments with obstacles. Different from the traditional sparse reward function, two new dense reward functions are presented. First, an azimuth reward function is proposed to accelerate the learning process locally and produce a more reasonable trajectory by modeling the position and orientation constraints, which dramatically reduces the blindness of exploration. To further improve efficiency, a subtask-level reward function is proposed to provide global guidance for the agent. It is designed under the assumption that the task can be divided into several subtasks, which greatly reduces invalid exploration. Extensive experiments show that the proposed reward functions improve the convergence rate by up to three times with state-of-the-art DRL methods, increase the convergence mean by 2.25%-13.22%, and decrease the standard deviation by 10.8%-74.5%.

Highlights

  • Trajectory planning is a fundamental problem in the motion control of robot manipulators

  • Dense reward functions give more information after each action, which reduces the blindness of exploration of Deep Reinforcement Learning (DRL) methods in trajectory planning tasks

  • While the proposed azimuth reward function reduces the local blindness of exploration of DRL methods, it focuses on local exploration at a single time step and lacks global guidance in the trajectory planning task; the subtask-level reward function is proposed to provide this global guidance (a minimal sketch follows this list)
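
The snippet below is a minimal illustrative sketch of a subtask-level reward, not the paper's exact formulation. It assumes the task decomposes into an ordered set of subtasks, each with a hypothetical completion predicate and bonus value chosen here only for illustration.

```python
# Sketch of a subtask-level reward (illustrative assumptions, not the paper's formulation).
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Subtask:
    name: str
    done: Callable[[Dict[str, float]], bool]  # True when this subtask is completed
    bonus: float                              # reward granted on first completion


def subtask_level_reward(state: Dict[str, float],
                         subtasks: List[Subtask],
                         progress: Set[str]) -> float:
    """Grant a one-time bonus for each newly completed subtask.

    `progress` records subtasks already rewarded, so the bonus acts as
    global guidance toward the final goal rather than a repeated per-step signal.
    """
    reward = 0.0
    for task in subtasks:
        if task.name not in progress and task.done(state):
            progress.add(task.name)
            reward += task.bonus
    return reward


# Example usage with made-up subtasks for a reach-with-obstacles scenario.
subtasks = [
    Subtask("clear_obstacle", lambda s: s["dist_to_obstacle"] > 0.10, bonus=1.0),
    Subtask("approach_target", lambda s: s["dist_to_target"] < 0.05, bonus=2.0),
    Subtask("align_gripper", lambda s: s["orientation_error"] < 0.10, bonus=3.0),
]
progress: Set[str] = set()
r = subtask_level_reward(
    {"dist_to_obstacle": 0.20, "dist_to_target": 0.04, "orientation_error": 0.30},
    subtasks, progress)  # rewards the first two subtasks once each
```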


Summary

INTRODUCTION

Trajectory planning is a fundamental problem in the motion control of robot manipulators. It enables the manipulator to autonomously learn and plan an optimal trajectory in an unstructured working environment. However, the action spaces of many early DRL methods are discrete, so they cannot be applied to tasks with continuous action spaces, such as trajectory planning of a robot manipulator. To solve this problem, Deep Deterministic Policy Gradient (DDPG) [19] and Asynchronous Advantage Actor-Critic (A3C) [20] have been put forward. The primary contributions of this paper are summarized as follows: 1) Considering the features of the trajectory planning task and the working environment, two new dense reward functions are proposed. Dense reward functions give more information after each action, which reduces the blindness of exploration of DRL methods in trajectory planning tasks.
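
To make the sparse/dense distinction concrete, the sketch below contrasts a traditional sparse reward with a dense reward in the spirit of the azimuth reward function. The weights `w_pos`, `w_ori` and the orientation-error input are assumptions for illustration only; the paper's actual azimuth reward models the position and orientation constraints in its own form.

```python
# Minimal sketch: sparse vs. dense (azimuth-style) reward for a reaching task.
# All parameter choices here are hypothetical, not the paper's formulation.
import numpy as np


def sparse_reward(dist_to_target: float, tol: float = 0.01) -> float:
    # Traditional sparse reward: informative only when the target is reached.
    return 1.0 if dist_to_target < tol else 0.0


def dense_reward(ee_pos: np.ndarray, target_pos: np.ndarray,
                 orientation_error: float,
                 w_pos: float = 1.0, w_ori: float = 0.5) -> float:
    # Dense reward: every step is scored by the remaining position and
    # orientation error, so the agent receives feedback after each action
    # instead of only at the goal.
    pos_err = float(np.linalg.norm(target_pos - ee_pos))    # position constraint term
    return -(w_pos * pos_err + w_ori * orientation_error)   # orientation constraint term


# The dense reward distinguishes two non-goal states that the sparse reward
# both scores as 0, which is what reduces the blindness of exploration.
near = dense_reward(np.array([0.0, 0.0, 0.05]), np.zeros(3), orientation_error=0.1)
far = dense_reward(np.array([0.0, 0.0, 0.50]), np.zeros(3), orientation_error=0.8)
assert near > far and sparse_reward(0.05) == sparse_reward(0.50) == 0.0
```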

AZIMUTH REWARD FUNCTION
POSITION REWARD FUNCTION
ORIENTATION REWARD FUNCTION
MODELING OF AZIMUTH REWARD FUNCTION
TARGET APPROACHING REWARD FUNCTION
IMPLEMENTATION OF REWARD FUNCTION
EXPERIMENTAL RESULTS AND DISCUSSIONS
CONCLUSIONS
