Abstract

This paper proposes a novel deep reinforcement learning algorithm, Proximal Policy Optimization (PPO) based on F-divergence, for the motion control of an unmanned surface vehicle (USV). To address the nonlinear and underactuated characteristics of the USV system, the new algorithm overcomes the tendency of PPO to fall into local optima during training and improves the diversity of exploration. Using the OpenAI simulation environment, the paper analyzes the motion of the USV on the water surface and establishes a three-degree-of-freedom kinematic and dynamic mathematical model. The improved algorithm is used to design the USV motion controller, together with a compound reward that effectively improves the learning efficiency of the network. Simulation results show the effectiveness of the improved PPO algorithm in USV motion control and verify the superiority of the improved algorithm.
