This paper proposes a distributed soft formation collision avoidance strategy to address formation obstacle avoidance in complex scenarios subject to environmental interference, where unmanned surface vehicles (USVs) have limited observation and no global reference frame for localisation. A multi-task training framework for formation control is also developed, based on the motion characteristics of USVs and the leader–follower method. The soft actor–critic (SAC) reinforcement learning algorithm is adapted to construct the agents. With elaborate auxiliary tasks, the soft formation algorithm is highly resilient to disturbances caused by unknown environmental loads or by obstacle avoidance manoeuvres, even with only partial information about the environmental state, and thus maintains the formation shape. The proposed trajectory communication system and policy sharing mechanism allow USVs to approach the destination and avoid collisions whilst maintaining a non-compact formation that can be broken up temporarily to improve obstacle avoidance and restored quickly once the obstacle has been cleared. Trained agents can be applied to different formations of varying sizes. Simulation results demonstrate the feasibility and effectiveness of the proposed method in different complex sea scenarios, as well as its robustness to unknown environmental disturbances.