Reinforcement learning (RL) is a promising technique for designing a model-free controller through interaction with the environment. Several researchers have applied RL to motion control of autonomous underwater vehicles (AUVs), for example trajectory tracking. However, existing RL-based controllers usually assume that the unknown AUV dynamics remain invariant during operation, which limits their application in complex underwater environments. In this article, a novel meta-RL-based control scheme is proposed for trajectory tracking control of an AUV in the presence of unknown and time-varying dynamics. To this end, we divide the tracking task for an AUV with time-varying dynamics into multiple specific tasks with fixed time-varying dynamics, and train on these tasks with meta-RL to distill a general control policy. The resulting policy transfers to the testing phase with high adaptability. Inspired by the line-of-sight (LOS) tracking rule, we formulate each specific task as a Markov decision process (MDP) with a well-designed state and reward function. Furthermore, a novel policy network with an attention module is proposed to extract the hidden information about the AUV dynamics. A simulation environment with time-varying dynamics is established, and the simulation results demonstrate the effectiveness of the proposed method.