USV formation navigation decision-making through hybrid deep reinforcement learning using self-attention mechanism

Zhewen Cui, Wei Guan, Xianku Zhang

doi:10.1016/j.eswa.2024.124906

Abstract

To address the challenge of balancing autonomous collision avoidance and formation maintenance for Unmanned Surface Vessels (USVs) in uncertain environments, this study proposes a formation construction and navigation decision-making strategy based on Hybrid Deep Reinforcement Learning (HDRL). The novelty of this study is threefold: (1) An HDRL training approach is proposed that uses different DRL algorithms for the virtual leader and the followers, thereby enhancing the decision-making adaptability of the USV formation. (2) A multi-head attention mechanism in the decentralized Critic is used to sharpen the HDRL algorithm's focus on different agents, thereby improving convergence speed. (3) The method combines a carefully designed hybrid reward function with Optimal Reciprocal Collision Avoidance (ORCA) for speed selection, thereby providing speed recommendations suited to the current situation. In simulation, the method performs strongly in terms of success rate, navigation time, average reward value, and other relevant indicators. Finally, the method is validated using real-time navigation data from the Panama Canal, supporting its potential for engineering application.
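As an illustration of contribution (2), the following is a minimal sketch of how multi-head attention can weight the encodings of other agents from one agent's point of view. It is not the authors' network: the abstract does not describe the architecture, so the function names, the toy encodings, and the omission of learned projection matrices (each head simply reads its own slice of the vector) are all assumptions made for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multi_head_attention(query, keys, values, num_heads):
    """Scaled dot-product attention with several heads (hypothetical helper).

    query  : encoding of the agent whose Critic is being evaluated
    keys   : encodings of the other agents, used to score relevance
    values : encodings of the other agents, mixed by the attention weights
    Assumption: no learned projections; head h uses slice h of each vector.
    """
    dim = len(query)
    if dim % num_heads != 0:
        raise ValueError("vector length must be divisible by num_heads")
    head_dim = dim // num_heads

    context, weights_per_head = [], []
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        q = query[lo:hi]
        # Relevance of every other agent to this agent, for this head.
        scores = [
            sum(a * b for a, b in zip(q, k[lo:hi])) / math.sqrt(head_dim)
            for k in keys
        ]
        weights = softmax(scores)
        weights_per_head.append(weights)
        # Weighted mix of the other agents' values on this head's slice.
        for d in range(lo, hi):
            context.append(sum(w * v[d] for w, v in zip(weights, values)))
    return context, weights_per_head


# Toy example: one follower attending to three other vessels.
own = [1.0, 0.0, 0.0, 1.0]
others = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
context, weights = multi_head_attention(own, others, others, num_heads=2)
```

In this toy setting each head ends up favouring a different vessel, which is the intuition behind letting a Critic attend to different agents through separate heads. A trained Critic would learn query, key, and value projections rather than slicing the raw encodings.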

Full Text