Abstract

The path planning of autonomous underwater vehicles (AUVs) has shown great potential in various Internet of Underwater Things (IoUT) applications. Although considerable efforts have been made, prior studies face some limitations. For one thing, existing work relies only on ocean current simulation models without introducing real ocean information, and is therefore not supported by real data. For another, traditional path planning algorithms depend strongly on the environment and lack flexibility: once the environment changes, they must be remodeled and the path replanned. To overcome these challenges, this article proposes comprehensive ocean information D3QN (COID), an AUV path planning scheme exploiting comprehensive ocean information and reinforcement learning (RL), which consists of three steps. First, we introduce comprehensive real ocean data, including weather, temperature, thermohaline, and current data, and feed them into the Regional Ocean Modeling System to generate reliable ocean currents. Next, through a well-designed state transition function and reward function, we build a 3-D grid model of the ocean environment for RL. Furthermore, based on the framework of the double dueling deep $Q$-network (D3QN), COID integrates local ocean current and position features to provide the state input and uses priority sampling to accelerate network convergence. The performance of COID has been evaluated through numerical results, which demonstrate efficient path planning and high flexibility for extension to different ocean environments.
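
As a rough illustration only (not the authors' implementation), the sketch below shows a dueling double Q-network of the kind COID builds on, with a state formed by fusing local ocean current features and the AUV's grid position. The layer sizes, the flattened 3x3x3 current neighborhood, and the 26-direction action set are assumptions made for this example.

```python
# Minimal sketch, assuming a PyTorch setup; hyperparameters are illustrative.
import torch
import torch.nn as nn


class DuelingQNet(nn.Module):
    def __init__(self, current_dim=3 * 27, pos_dim=3, n_actions=26, hidden=128):
        super().__init__()
        # Separate encoders for the local current field and the position, then fuse.
        self.current_enc = nn.Sequential(nn.Linear(current_dim, hidden), nn.ReLU())
        self.pos_enc = nn.Sequential(nn.Linear(pos_dim, hidden), nn.ReLU())
        self.fuse = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU())
        # Dueling heads: state value V(s) and per-action advantages A(s, a).
        self.value = nn.Linear(hidden, 1)
        self.advantage = nn.Linear(hidden, n_actions)

    def forward(self, current, pos):
        h = self.fuse(torch.cat([self.current_enc(current), self.pos_enc(pos)], dim=-1))
        v, a = self.value(h), self.advantage(h)
        # Standard dueling combination: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
        return v + a - a.mean(dim=-1, keepdim=True)


@torch.no_grad()
def double_dqn_target(online, target, cur_next, pos_next, reward, done, gamma=0.99):
    # Double-DQN target: the online network selects the greedy next action,
    # the target network evaluates it, reducing Q-value overestimation.
    best = online(cur_next, pos_next).argmax(dim=-1, keepdim=True)
    q_next = target(cur_next, pos_next).gather(-1, best).squeeze(-1)
    return reward + gamma * (1.0 - done) * q_next
```

In a prioritized-sampling variant, the absolute temporal-difference error between this target and the online network's Q-value would be used as the transition's sampling priority in the replay buffer.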
