Abstract

Reinforcement learning (RL) is well suited to the design of path-following controllers for unmanned surface vessels (USVs) because of its model-free, self-supervised nature. However, the USV state transition is not strictly Markovian, and the choice of state space, action space, and reward function strongly affects the performance of the RL-based controller. Based on the dynamic and kinematic characteristics of the USV, we design a new state space that reduces the influence of large inertia and state hysteresis on the training of the RL agent. A comprehensive reward function, built by decomposing the path-following task, is proposed to prevent the RL-based controller from falling into a local optimum. A dynamic threshold is used in the reward function to accelerate training while preserving tracking accuracy. Finally, the effectiveness of the proposed RL-based controller is evaluated through simulation and field experiments on an actual USV.
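To make the dynamic-threshold idea concrete, the sketch below shows one plausible shaping of such a reward in Python. All names, weights, and the linear tightening schedule (e_max_init, e_max_final, decay_steps) are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def path_following_reward(cross_track_error, heading_error, step,
                          e_max_init=10.0, e_max_final=1.0, decay_steps=50_000):
    """Hypothetical reward combining cross-track and heading terms.

    The admissible cross-track-error threshold shrinks from e_max_init to
    e_max_final over decay_steps training steps, so early training is rewarded
    loosely (faster learning) while later training demands higher accuracy.
    """
    # Dynamic threshold: linearly tightened as training progresses.
    frac = min(step / decay_steps, 1.0)
    threshold = e_max_init + frac * (e_max_final - e_max_init)

    # Sub-task rewards from task decomposition:
    # 1) stay close to the path, 2) align heading with the path tangent.
    r_track = np.exp(-abs(cross_track_error) / threshold)   # in (0, 1]
    r_heading = np.cos(heading_error)                       # in [-1, 1]

    # Penalize exceeding the current threshold to discourage divergence.
    penalty = -1.0 if abs(cross_track_error) > threshold else 0.0

    return r_track + 0.5 * r_heading + penalty
```

Under this kind of schedule the agent earns useful reward signal even with large early tracking errors, while the shrinking threshold gradually enforces the accuracy required at convergence.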
