Abstract
To address collision-avoidance decision-making for ships in complex encounter situations, this study proposes an Autonomous Collision Avoidance Decision-making (ACAD) model for Unmanned Surface Vessels (USVs) based on the Attention Long Short-Term Memory Twin Delayed Deep Deterministic Policy Gradient (ATL-TD3) algorithm. The method has two novel aspects: (1) An LSTM and a multi-head self-attention mechanism are incorporated into the existing TD3 network to strengthen its focus on pertinent historical state information, and the generalization ability of the trained model is further improved by optimizing the accumulated experience. (2) A comprehensive reward mechanism is designed that incorporates ship domain, arena, yaw, and other relevant factors, and that handles the encounters defined in clauses 13–17 in Chapter 2 of COLREGs as well as immediate danger. In the experiments, the average reward curve and the success rate during training are compared across different environments. Additionally, 20 cases of classic Imazu encounters and real navigation data from the Suez Canal are used to validate the proposed algorithm. The results highlight that the ACAD model trained with the ATL-TD3 algorithm performs remarkably well in the generalization verification, surpassing conventional collision avoidance methods.
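To make the idea of a composite reward more concrete, the following is a minimal sketch of how penalties for ship domain, arena, and yaw might be combined into one scalar. It is not the reward function used in the study: the function name `composite_reward`, the `RewardConfig` fields, the circular domain and arena shapes, the radii, and all weights are hypothetical values chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardConfig:
    # All values are illustrative assumptions, not taken from the paper.
    domain_radius: float = 200.0  # ship domain radius in metres (hypothetical)
    arena_radius: float = 800.0   # arena radius in metres (hypothetical)
    w_domain: float = 10.0        # fixed penalty for entering the ship domain
    w_arena: float = 1.0          # maximum penalty while inside the arena
    w_yaw: float = 0.5            # weight of the heading-deviation penalty


def composite_reward(distance_m: float, yaw_error_rad: float,
                     cfg: RewardConfig = RewardConfig()) -> float:
    """Combine distance-based and yaw-based penalties into one scalar."""
    reward = 0.0
    if distance_m < cfg.domain_radius:
        # Target ship is inside the (assumed circular) ship domain.
        reward -= cfg.w_domain
    elif distance_m < cfg.arena_radius:
        # Inside the arena: penalty grows linearly towards the domain edge.
        span = cfg.arena_radius - cfg.domain_radius
        reward -= cfg.w_arena * (cfg.arena_radius - distance_m) / span
    # Yaw term: deviation from the planned course, scaled to [0, 1].
    reward -= cfg.w_yaw * abs(yaw_error_rad) / math.pi
    return reward


if __name__ == "__main__":
    for d in (1000.0, 500.0, 100.0):
        print(d, composite_reward(d, yaw_error_rad=0.1))
```

In this sketch the arena acts as an early-warning zone with a graded penalty, while the ship domain is a hard boundary with a large fixed penalty. COLREGs-specific terms, which the abstract also mentions, are deliberately left out because their form cannot be inferred from the abstract alone.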