Path Planning for Multi-Arm Manipulators Using Deep Reinforcement Learning: Soft Actor-Critic with Hindsight Experience Replay.

Evan Prianto, Jung-Su Kim, Myeongseop Kim, Ji-Hun Bae, Jae-Han Park

doi:10.3390/s20205911

Abstract

Because path planning for multi-arm manipulators is a complicated, high-dimensional problem, generating paths quickly and effectively for arbitrarily given start and goal locations of the end effector is not easy. In deep reinforcement learning-based path planning in particular, the high dimensionality makes it difficult for existing methods to explore efficiently, which is crucial for successful training. The recently proposed soft actor–critic (SAC) is well known for its good exploration ability, owing to the entropy term in its objective function. Motivated by this, this paper proposes a SAC-based path planning algorithm. Hindsight experience replay (HER) is also employed for sample efficiency, and configuration space augmentation is used to deal with the complicated configuration space of the multiple arms. Both simulation and experimental results are given to show the effectiveness of the proposed algorithm, and a comparison demonstrates that the proposed method outperforms existing methods.
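As a rough illustration of how HER improves sample efficiency, the following is a minimal Python sketch of hindsight goal relabeling. It is not the authors' implementation. It assumes the commonly used "future" relabeling strategy, a sparse 0/-1 reward with a hypothetical distance tolerance `tol`, and that each state is itself the achieved end-effector position. The names `sparse_reward` and `her_relabel` are illustrative only.

```python
import random


def sparse_reward(achieved, goal, tol=0.05):
    """Return 0.0 if `achieved` is within `tol` of `goal` (Euclidean), else -1.0.

    The 0/-1 reward shape and `tol` are assumptions made for this sketch.
    """
    dist = sum((a - g) ** 2 for a, g in zip(achieved, goal)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def her_relabel(episode, k=4, rng=None):
    """Augment an episode with hindsight goals ("future" strategy).

    episode: list of (state, action, next_state, goal) tuples.
    Returns a list of (state, action, reward, next_state, goal) tuples
    containing each original transition followed by k relabeled copies.
    """
    rng = rng or random.Random(0)
    out = []
    for t, (s, a, s_next, g) in enumerate(episode):
        # Keep the original transition with its original goal.
        out.append((s, a, sparse_reward(s_next, g), s_next, g))
        for _ in range(k):
            # Pretend a state reached at step t or later was the goal all along.
            future = rng.randint(t, len(episode) - 1)
            new_goal = episode[future][2]
            out.append((s, a, sparse_reward(s_next, new_goal), s_next, new_goal))
    return out
```

Even when an episode never reaches the commanded goal, the relabeled copies contain transitions with a non-negative reward, which gives an off-policy learner such as SAC a useful training signal from otherwise failed trajectories.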

Highlights

In the Industry 4.0 era, one of the important elements of a smart factory in the manufacturing industry is automation via collaboration of robot manipulators, and the manufacturing industry has been less affected by the human workforce [1].
The focus of this paper is on devising a deep reinforcement learning-based path planning algorithm for multi-arm manipulators [26].
The results show that the proposed method finds a shorter and smoother path in most scenarios, owing to the enhanced exploration of soft actor–critic (SAC), and outperforms existing methods such as the probabilistic roadmap (PRM) [29] and TD3.

Summary

Introduction

In the Industry 4.0 era, one of the important elements of a smart factory in the manufacturing industry is automation via collaboration of robot manipulators, and the manufacturing industry has been less affected by the human workforce [1]. A representative approach to path planning for multi-arm manipulators is the sampling-based algorithm, which computes the path after building a graph from sampled points of the workspace [7]. In potential field-based methods, the direction of the optimal path can be obtained by computing the gradient of the potential field [18]; however, such a method can be trapped in a local minimum of the potential field and fail to find the right path [19]. Note that the path planning under consideration is complicated and high-dimensional by nature. For this reason, an effective path planning algorithm for multi-arm manipulators has to be developed. In the path planning literature, there are already some deep learning-based approaches implemented for robot applications such as mobile manipulation [20,21], unmanned ships [22], and even multiple mobile robots [23]. The focus of this paper is on devising a deep reinforcement learning-based path planning algorithm for multi-arm manipulators [26].
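The local-minimum problem of gradient-based (potential field) planning mentioned above can be illustrated with a small Python sketch. It is an assumption-laden toy example, not a method from the paper. It uses a 2-D point robot, a quadratic attractive potential, a single point obstacle with a classical inverse-distance repulsive potential, and hypothetical gains `k_att`, `k_rep` and influence radius `rho0`.

```python
import math


def potential_gradient(q, goal, obstacle, k_att=1.0, k_rep=1.0, rho0=1.0):
    """Gradient of U(q) = U_att(q) + U_rep(q) for a 2-D point robot.

    U_att = 0.5 * k_att * |q - goal|^2
    U_rep = 0.5 * k_rep * (1/rho - 1/rho0)^2 if rho <= rho0, else 0,
    where rho is the distance from q to the (point) obstacle.
    """
    gx = k_att * (q[0] - goal[0])
    gy = k_att * (q[1] - goal[1])
    dx, dy = q[0] - obstacle[0], q[1] - obstacle[1]
    rho = math.hypot(dx, dy)
    if 0.0 < rho <= rho0:
        # d/dq of the repulsive term; it pushes q away from the obstacle.
        c = -k_rep * (1.0 / rho - 1.0 / rho0) / rho ** 3
        gx += c * dx
        gy += c * dy
    return gx, gy


def descend(start, goal, obstacle, step=0.01, iters=5000, tol=1e-3):
    """Follow the negative gradient until it (nearly) vanishes."""
    q = start
    for _ in range(iters):
        gx, gy = potential_gradient(q, goal, obstacle)
        if math.hypot(gx, gy) < tol:
            break
        q = (q[0] - step * gx, q[1] - step * gy)
    return q


# Obstacle far from the straight line to the goal: the goal is reached.
free = descend((0.0, 0.0), (4.0, 0.0), obstacle=(2.0, 5.0))
# Obstacle exactly on the line to the goal: descent stalls in front of it.
stuck = descend((0.0, 0.0), (4.0, 0.0), obstacle=(2.0, 0.0))
```

In the second call the attractive and repulsive gradients cancel before the obstacle, so the planner stops at a point where the gradient is zero but the goal has not been reached.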

Methods

Results

Conclusion