Abstract

This paper proposes a guidance strategy and a controller based on end-to-end deep reinforcement learning (DRL) to address the problem of unmanned surface vehicle (USV) interception and obstacle avoidance. A deep deterministic policy gradient (DDPG) algorithm is introduced to generate the interception strategy, and a reward function with multiple objectives is designed. The artificial potential field (APF) method is used to obtain an evasion strategy for the target USVs, enabling them to perform evasive actions in reaction to the interception. To find a trade-off between time consumption and the safety of obstacle avoidance, a multi-objective equilibrium method is proposed, in which an incremental proportion regulator based on prior knowledge dynamically modifies the reward function. In addition, a virtual-reality 3D simulator based on ROS and Gazebo is constructed to present the interception process. Simulation results are presented to demonstrate the effectiveness of the proposed method. Compared to the original DDPG algorithm, the proposed multi-objective equilibrium method shortens the interception path while ensuring the safety of obstacle avoidance. Compared to the traditional model-based approach, the DRL-based controller shows better performance in interception missions, and its robustness to dynamic obstacles is shown in multiple tests.
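
To illustrate the kind of APF evasion strategy mentioned for the target USVs, the following is a minimal sketch of a textbook 2D artificial potential field. It is not the paper's implementation. The function name `apf_evasion_heading`, the gains `k_att` and `k_rep`, the influence radius `d0`, and the treatment of the interceptor as one more repulsive "threat" are all assumptions made for illustration.

```python
import math


def apf_evasion_heading(target, goal, threats, k_att=1.0, k_rep=50.0, d0=20.0):
    """Return a unit heading (hx, hy) for a target USV from a basic APF.

    target, goal: (x, y) positions; threats: list of (x, y) positions of
    obstacles and interceptors. All parameters are hypothetical.
    """
    # Attractive term: pulls the target toward its (assumed) goal point.
    fx = k_att * (goal[0] - target[0])
    fy = k_att * (goal[1] - target[1])

    for tx, ty in threats:
        dx, dy = target[0] - tx, target[1] - ty
        d = math.hypot(dx, dy)
        # Repulsion acts only inside the influence radius d0.
        if 0.0 < d < d0:
            # Classic repulsive gradient magnitude: k_rep * (1/d - 1/d0) / d^2
            mag = k_rep * (1.0 / d - 1.0 / d0) / (d * d)
            fx += mag * dx / d
            fy += mag * dy / d

    norm = math.hypot(fx, fy)
    if norm == 0.0:
        return (0.0, 0.0)  # forces cancel exactly; no preferred heading
    return (fx / norm, fy / norm)
```

With no threat inside `d0`, the heading points straight at the goal; a nearby interceptor bends the heading away from it, which is the evasive behaviour the abstract describes in qualitative terms.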
