AUV Collision Avoidance Planning Method Based on Deep Deterministic Policy Gradient

Jianya Yuan,Dan Yu,Wei Gao,Mengxue Han,Hongjian Wang,Bo Zhong

doi:10.3390/jmse11122258

Abstract

Collision avoidance planning has always been a hot and important issue in the field of unmanned aircraft research. In this article, we describe an online collision avoidance planning algorithm for autonomous underwater vehicle (AUV) autonomous navigation, which relies on its own active sonar sensor to detect obstacles. The improved particle swarm optimization (I-PSO) algorithm is used to complete the path planning of the AUV under the known environment, and we use it as a benchmark to improve the fitness function and inertia weight of the algorithm. Traditional path-planning algorithms rely on accurate environment maps, where re-adapting the generated path can be highly demanding in terms of computational cost. We propose a deep reinforcement learning (DRL) algorithm based on collision avoidance tasks. The algorithm discussed in this paper takes into account the relative position of the target point and the rate of heading change from the previous timestep. Its reward function considers the target point, running time and turning angle at the same time. Compared with the LSTM structure, the Gated Recurrent Unit (GRU) network has fewer parameters, which helps to save training time. A series of simulation results show that the proposed deep deterministic policy gradient (DDPG) algorithm can obtain excellent results in simple and complex environments.

Full Text