Abstract

A reinforcement learning algorithm based on Deep Q-Networks (DQN) is used for path following and heading control of a ship in calm water and waves. The ship's rudder action is selected by the developed DQN model. The spatial positions, linear velocities, yaw rate, Heading Error (HE) and Cross Track Error (CTE) constitute the state space, and a set of rudder angles constitutes the action space of the DQN model. The state variables are continuous, while the action space is discrete. The decaying ϵ-greedy method is used for exploration. The reward function is modeled such that the agent tries to reduce the CTE and the HE. The L7 model of the KVLCC2 tanker, which is well documented in the literature, is used for testing the algorithm. The vessel dynamics are represented using a 3DoF maneuvering model that includes hydrodynamic, propeller, rudder and wave forces. The wave disturbances are calculated from the second-order mean drift forces. The environment is assumed to have the Markov property. The CTE and HE are calculated using the Line of Sight (LOS) algorithm. The effect of pre-trained weights on different heading actions is investigated with respect to the exploration threshold. The DQN is trained and tested for heading control and path following in calm water and in different wave headings.
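A minimal sketch of the decision loop described above, assuming illustrative values for the discrete rudder-angle set, the LOS look-ahead distance, the ϵ-decay schedule, and the reward weights (none of these are specified in the abstract):

    # Illustrative sketch, not the authors' implementation. The rudder set,
    # look-ahead distance, epsilon schedule and reward weights are assumptions.
    import math
    import random

    RUDDER_ANGLES = [-35, -20, -10, 0, 10, 20, 35]  # assumed discrete actions (deg)

    def los_errors(x, y, psi, wp_prev, wp_next, delta=2.0):
        """CTE and HE from the Line of Sight (LOS) geometry between waypoints."""
        path_angle = math.atan2(wp_next[1] - wp_prev[1], wp_next[0] - wp_prev[0])
        # Signed perpendicular distance from the ship to the waypoint segment.
        cte = (-(x - wp_prev[0]) * math.sin(path_angle)
               + (y - wp_prev[1]) * math.cos(path_angle))
        # Desired LOS heading toward a look-ahead point at distance delta (assumed).
        psi_los = path_angle + math.atan2(-cte, delta)
        he = (psi_los - psi + math.pi) % (2 * math.pi) - math.pi  # wrap to [-pi, pi]
        return cte, he

    def select_action(q_values, step, eps_start=1.0, eps_end=0.05, decay=1e-4):
        """Decaying epsilon-greedy selection over the discrete rudder set."""
        eps = eps_end + (eps_start - eps_end) * math.exp(-decay * step)
        if random.random() < eps:
            return random.randrange(len(RUDDER_ANGLES))  # explore
        return max(range(len(RUDDER_ANGLES)), key=lambda a: q_values[a])  # exploit

    def reward(cte, he, k_cte=1.0, k_he=0.5):
        """Penalize CTE and HE so the agent drives both toward zero
        (weights are assumptions)."""
        return -(k_cte * abs(cte) + k_he * abs(he))

The Q-values here would come from the trained DQN evaluated on the current state; the exploration rate decays with the training step so the agent shifts from random rudder commands to exploiting the learned policy.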
