Abstract

Obstacle avoidance planning has always been an essential technology for underwater operations by Autonomous Underwater Vehicles (AUVs). In complex underwater obstacle environments, planning a path that balances computational efficiency and energy consumption is of great significance to underwater operations. The rapidly-exploring random tree star (RRT*) method was proposed in recent years to solve the path optimization problem. Compared with the original rapidly-exploring random tree (RRT), RRT* gradually optimizes the path, reduces useless node memory storage, and greatly improves search efficiency. However, since the RRT* algorithm is essentially a random sampling extension, it suffers from weak goal orientation, so the cost of exploring ineffective areas is high. In this paper, an RRT* algorithm driven by reinforcement learning (RL-RRT*) is used to reduce the cost of exploring invalid areas. This method uses Q-Learning to optimize the random tree expansion process of RRT*, which not only preserves the random exploratory property of RRT* in unknown environments, but also uses Q-Learning to reduce the exploration cost of invalid regions. Specifically, while satisfying the AUV kinematics model, the method designs a reward function for the extended node, a variable probability parameter for biasing toward the target, and a dynamic step function; together these reduce invalid nodes, accelerate the exploration process, and improve path planning efficiency. In simulation experiments, the method is applied in two unknown maze environments. The experimental results demonstrate the feasibility of the RL-RRT* algorithm and its advantages in efficiency and performance.
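The abstract names three ingredients of the expansion strategy: a reward on extended nodes, a variable goal-bias probability, and a dynamic step size. The abstract gives no implementation details, so the following is a minimal Python sketch of how a Q-table can bias the growth of a sampling-based tree in 2D; all reward values, the epsilon and goal-bias parameters, the step rule, and the grid discretization are illustrative assumptions rather than the authors' settings, and the RRT* choose-parent/rewire steps and the AUV kinematics constraints are omitted for brevity.

```python
import math
import random

# Illustrative sketch: Q-Learning-biased tree expansion toward a goal.
# Not the paper's implementation; parameters and rewards are assumptions.

GOAL = (9.0, 9.0)
OBSTACLES = [((4.0, 4.0), 1.5)]  # circular obstacles as (center, radius)
ACTIONS = [(math.cos(k * math.pi / 4), math.sin(k * math.pi / 4))
           for k in range(8)]    # 8 compass headings as unit vectors

def cell(p):
    # Discretize the continuous state so it can index the Q-table.
    return (int(p[0]), int(p[1]))

def collides(p):
    return any(math.dist(p, c) < r for c, r in OBSTACLES)

Q = {}  # (cell, action_index) -> value
def q(s, a):
    return Q.get((s, a), 0.0)

def expand(tree, eps=0.2, goal_bias=0.1, alpha=0.5, gamma=0.9):
    # Pick an existing node to extend (nearest-neighbor search omitted).
    node = random.choice(tree)
    s = cell(node)
    a = None
    if random.random() < goal_bias:
        # Variable goal bias: occasionally steer straight at the goal.
        dx, dy = GOAL[0] - node[0], GOAL[1] - node[1]
        d = math.hypot(dx, dy) or 1.0
        direction = (dx / d, dy / d)
    elif random.random() < eps:
        # Keep RRT's random exploratory behavior.
        a = random.randrange(len(ACTIONS))
        direction = ACTIONS[a]
    else:
        # Exploit the learned Q-values for this cell.
        a = max(range(len(ACTIONS)), key=lambda i: q(s, i))
        direction = ACTIONS[a]
    # Dynamic step: shrink the step as the goal gets close (assumed rule).
    step = min(1.0, 0.5 * math.dist(node, GOAL) + 0.1)
    new = (node[0] + step * direction[0], node[1] + step * direction[1])
    # Reward shaping on the extended node (assumed values).
    if collides(new):
        r, ok = -1.0, False
    else:
        progress = math.dist(node, GOAL) - math.dist(new, GOAL)
        r = 1.0 if math.dist(new, GOAL) < 0.5 else 0.1 * progress
        ok = True
    if a is not None:
        # One-step Q-Learning update.
        best_next = max(q(cell(new), i) for i in range(len(ACTIONS)))
        Q[(s, a)] = q(s, a) + alpha * (r + gamma * best_next - q(s, a))
    if ok:
        tree.append(new)
    return ok and math.dist(new, GOAL) < 0.5

tree = [(0.0, 0.0)]
for _ in range(5000):
    if expand(tree):
        print("reached goal region; tree size:", len(tree))
        break
```

In a full RL-RRT* implementation the learned values would steer node expansion while the usual RRT* rewiring still guarantees asymptotic path-cost improvement; this sketch only shows the sampling-bias idea that the abstract describes.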
