Abstract

This paper proposes a method for navigation and obstacle avoidance of an unmanned surface vessel (USV) based on reinforcement learning and reward shaping. The approach uses a double deep Q-network (DDQN) to make decisions from the continuous states observed by the USV's sensors. In addition, a new reward function is designed from prior knowledge to accelerate the convergence of the algorithm and improve its performance. To train the neural networks, a simulation platform is developed in which a three-degree-of-freedom mathematical model describes the USV's dynamics and two-dimensional actions are required to control the USV. Simulation results on the platform demonstrate that DDQN improves the USV's navigation and obstacle-avoidance capabilities, and that the reward shaping technique speeds up convergence.
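The two ingredients named above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the reward function, so the shaping term below uses a generic potential-based form with an assumed potential (negative distance to the goal). The function names, the discount factor `gamma`, and the `scale` parameter are hypothetical and chosen only for illustration.

```python
def double_dqn_target(reward, next_q_online, next_q_target, done, gamma=0.99):
    """Double DQN bootstrap target for a single transition.

    The online network selects the greedy next action and the target
    network evaluates it, which reduces the overestimation bias of
    standard Q-learning targets.
    """
    if done:
        # Terminal transition: no bootstrapping from the next state.
        return reward
    # Action selection uses the online network's Q-values.
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # Action evaluation uses the target network's Q-values.
    return reward + gamma * next_q_target[best_action]


def shaped_reward(base_reward, dist_before, dist_after, gamma=0.99, scale=1.0):
    """Potential-based shaping with an ASSUMED potential phi(s) = -scale * distance.

    Adds gamma * phi(s') - phi(s) to the environment reward, so moving
    closer to the goal yields a positive bonus. The paper's actual reward
    design may differ.
    """
    phi_before = -scale * dist_before
    phi_after = -scale * dist_after
    return base_reward + gamma * phi_after - phi_before


if __name__ == "__main__":
    # Online network prefers action 1; target network values it at 0.3.
    print(double_dqn_target(1.0, [0.2, 0.9, 0.1], [0.5, 0.3, 0.8], done=False, gamma=0.5))
    # Moving from 10 m to 8 m from the goal earns a positive shaping bonus.
    print(shaped_reward(0.0, 10.0, 8.0, gamma=1.0))
```

In this sketch the shaped reward would replace the raw environment reward when computing the DDQN target, so that progress toward the goal is rewarded at every step rather than only on arrival.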
