Improved Deep Deterministic Policy Gradient for Dynamic Obstacle Avoidance of Mobile Robot

Liang Yan, Gang Wang, Zhijun Li, I-Ming Chen, Xiaoshan Gao

doi:10.1109/tsmc.2022.3230666

Abstract

When a mobile robot must perform tasks in an unknown and complex environment, the ability to avoid dynamic obstacles is critical. However, the conventional deep deterministic policy gradient (DDPG) for collision-free navigation can only perceive a fixed number of dynamic obstacles, so it cannot adapt to a stochastic working scenario. To overcome this limitation, this study proposes an improved DDPG algorithm. As an exploratory step, DDPG is combined with a long short-term memory (LSTM) network-based encoder, which encodes a variable number of obstacles into a fixed-length representation, to achieve dynamic obstacle avoidance for the mobile robot in a stochastic working scenario. Specifically, to support the LSTM network-based encoder, a safe processing rule is designed to guarantee that the information of all observable obstacles is represented completely. The encoder takes the latest environment information of the observable obstacles, processed by the safe processing rule, and generates a fixed-length state vector. In addition, a continuous state space for the mobile robot and obstacles, a reward function, and an action space are designed. Both simulations and experiments are conducted, and the results verify that the improved DDPG algorithm achieves collision-free trajectories in the presence of multiple dynamic obstacles. It also helps to reduce the path distance and motion time.
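To make the fixed-length encoding concrete, the following is a minimal sketch in plain Python, not the implementation used in this study. It runs an untrained, randomly initialized LSTM cell over a variable-length list of obstacles and returns the final hidden state, which is then concatenated with the robot state. Several elements are assumptions introduced only for illustration: the names `TinyLSTMEncoder`, `order_by_distance`, and `build_state`; the five-element obstacle feature layout (relative position, velocity, radius); the four-element robot state; and the farthest-first ordering, which stands in for the safe processing rule whose details are not given in the abstract.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTMEncoder:
    """Untrained LSTM cell with random weights (hypothetical, for illustration)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        cols = input_size + hidden_size
        # One weight matrix and bias per gate:
        # i = input, f = forget, g = cell candidate, o = output.
        self.W = {
            gate: [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                   for _ in range(hidden_size)]
            for gate in "ifgo"
        }
        self.b = {gate: [0.0] * hidden_size for gate in "ifgo"}

    def _linear(self, gate, xh):
        return [sum(w * v for w, v in zip(row, xh)) + bias
                for row, bias in zip(self.W[gate], self.b[gate])]

    def encode(self, obstacles):
        """Return the final hidden state; its length never depends on len(obstacles)."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for features in obstacles:
            xh = list(features) + h
            i = [sigmoid(z) for z in self._linear("i", xh)]
            f = [sigmoid(z) for z in self._linear("f", xh)]
            g = [math.tanh(z) for z in self._linear("g", xh)]
            o = [sigmoid(z) for z in self._linear("o", xh)]
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return h


def order_by_distance(obstacles):
    # Assumption: feed the farthest obstacle first so that the nearest one is
    # processed last. This is a stand-in, not the paper's safe processing rule.
    return sorted(obstacles, key=lambda ob: math.hypot(ob[0], ob[1]), reverse=True)


def build_state(robot_state, obstacles, encoder):
    # Fixed-length state vector: robot state followed by the obstacle encoding.
    return list(robot_state) + encoder.encode(order_by_distance(obstacles))


# Assumed obstacle layout: (rel_x, rel_y, vel_x, vel_y, radius).
encoder = TinyLSTMEncoder(input_size=5, hidden_size=8)
robot_state = [1.0, 0.0, 0.5, 0.1]
two_obstacles = [(1.0, 2.0, 0.1, 0.0, 0.3), (0.5, -0.5, 0.0, 0.2, 0.3)]
three_obstacles = two_obstacles + [(3.0, 1.0, -0.1, 0.0, 0.4)]
print(len(build_state(robot_state, two_obstacles, encoder)))    # 12
print(len(build_state(robot_state, three_obstacles, encoder)))  # 12
```

In this sketch the state vector has 12 elements (4 for the robot plus 8 for the encoding) regardless of how many obstacles are observed, which is the property that lets an actor network with a fixed input size consume it. In a trained system the encoder weights would be learned jointly with the DDPG actor and critic rather than drawn at random.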

Full Text