Abstract

In crowded waters where obstacle motion information is unknown, traditional methods often fail to ensure safe and autonomous collision avoidance. To address the challenges of information acquisition and decision delay, this study proposes an optimized autonomous navigation strategy that combines deep reinforcement learning with intrinsic and extrinsic rewards. By incorporating random network distillation (RND) into proximal policy optimization (PPO), the approach strengthens the incentive of autonomous ships to explore unknown environments and enables the autonomous generation of intrinsic reward signals for actions. For multi-ship collision avoidance scenarios, an environmental reward is designed based on the International Regulations for Preventing Collisions at Sea (COLREGs); this reward scheme categorizes dynamic obstacles into four collision avoidance situations. Experimental results demonstrate that the proposed algorithm outperforms the standard PPO algorithm, achieving more efficient and safer collision avoidance decision-making in crowded ocean environments with unknown motion information. This research provides a theoretical foundation and a methodological reference for the route deployment of autonomous ships.

