Abstract

Deep reinforcement learning is an artificial intelligence technique that combines deep learning and reinforcement learning and has been widely applied in many fields. The A3C (Asynchronous Advantage Actor-Critic) algorithm, a deep reinforcement learning method, makes effective use of computing resources and improves training efficiency by training the Actor-Critic asynchronously across multiple threads. Motivated by the strong performance of the A3C algorithm, this paper applies it to the UUV (Unmanned Underwater Vehicle) collision avoidance planning problem in unknown environments. The resulting collision avoidance planner can plan in real time while keeping the path length short, and its output actions satisfy the kinematic constraints of the UUV. For the UUV collision avoidance planning problem, this paper designs the state space, action space, and reward function. Simulation results show that the A3C collision avoidance planning algorithm can guide a UUV around obstacles to a preset target point. The planned path satisfies the heading constraints of the UUV, and the planning time is short enough to meet the requirements of real-time planning.
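
The abstract does not provide implementation details; as a rough illustration of the A3C structure it refers to (a shared actor-critic network updated from each worker's advantage estimates), the sketch below shows a minimal actor-critic model and the per-worker loss in PyTorch. The network size, the discrete heading-change action space, and the loss coefficients are assumptions made for illustration, not the paper's actual design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ActorCritic(nn.Module):
    """Shared-body actor-critic: the actor outputs a distribution over discrete
    heading-change actions, the critic estimates the state value V(s)."""
    def __init__(self, state_dim, n_actions, hidden=128):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.policy_head = nn.Linear(hidden, n_actions)  # actor
        self.value_head = nn.Linear(hidden, 1)           # critic

    def forward(self, state):
        h = self.body(state)
        return F.softmax(self.policy_head(h), dim=-1), self.value_head(h)

def a3c_loss(model, states, actions, returns,
             value_coef=0.5, entropy_coef=0.01):
    """Advantage actor-critic loss computed by one asynchronous worker on its
    own rollout; in A3C the resulting gradients are applied to a shared
    global network. Coefficients here are illustrative defaults."""
    probs, values = model(states)
    values = values.squeeze(-1)
    advantages = returns - values.detach()            # A(s,a) = R - V(s)
    log_probs = torch.log(
        probs.gather(1, actions.unsqueeze(1)).squeeze(1) + 1e-8)
    policy_loss = -(log_probs * advantages).mean()    # maximize advantage-weighted log-prob
    value_loss = F.mse_loss(values, returns)          # critic regression to returns
    entropy = -(probs * torch.log(probs + 1e-8)).sum(dim=1).mean()
    return policy_loss + value_coef * value_loss - entropy_coef * entropy
```

In such a setup, the UUV-specific design described in the abstract would enter through the state vector (e.g., relative target position and obstacle measurements), the discrete action set (bounded heading changes that respect the vehicle's kinematics), and the reward shaping; none of those specifics are given in the abstract, so they are left out of the sketch.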
