Abstract

Tracking control for autonomous underwater vehicles (AUVs) faces multifaceted challenges, making the acquisition of optimal demonstrations a daunting task. Suboptimal demonstrations, in turn, lead to reduced tracking accuracy. To address the problem of learning from suboptimal demonstrations, this paper proposes a model-free reinforcement learning (RL) method. Our approach uses suboptimal demonstrations to obtain an initial controller, which is iteratively refined during training. Because the demonstrations are suboptimal, they are removed from the replay buffer once it reaches capacity. Building upon the soft actor-critic (SAC) algorithm, our approach integrates a recurrent neural network (RNN) into the policy network to capture the relationship between states and actions. Moreover, we introduce logarithmic and cosine terms into the reward function to improve training effectiveness. Finally, we validate the effectiveness of the proposed Initialize Controller from Demonstrations (ICfD) algorithm through simulations with two reference trajectories, and we define a criterion for tracking success. The success rates of ICfD on the two reference trajectories are 95.60% and 94.05%, respectively, surpassing the state-of-the-art RL method SACfD (80.03% and 90.55%). The average one-step distance errors of ICfD are 1.20 m and 0.76 m, respectively, significantly lower than those of the S-plane controller (9.725 m and 8.325 m). In addition, we evaluate the generalization of the ICfD controller in different scenarios.
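To make the demonstration-handling step concrete, the following is a minimal sketch of a demonstration-seeded replay buffer of the kind the abstract describes: the buffer is pre-filled with suboptimal demonstration transitions, and once capacity is reached, newly collected on-policy transitions displace the oldest entries. The class name, capacity, and FIFO eviction policy are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch (not the paper's implementation): a bounded FIFO
# replay buffer seeded with suboptimal demonstrations, for use with SAC.
from collections import deque
import random


class DemoSeededReplayBuffer:
    """Replay buffer initialized with demonstration transitions."""

    def __init__(self, capacity=100_000):
        # Bounded deque: once full, the oldest entries (the demonstrations,
        # since they were inserted first) are discarded automatically.
        self.buffer = deque(maxlen=capacity)

    def seed_with_demonstrations(self, demo_transitions):
        """Load (state, action, reward, next_state, done) tuples from demonstrations."""
        self.buffer.extend(demo_transitions)

    def add(self, state, action, reward, next_state, done):
        """Store a transition collected by the current policy."""
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Uniformly sample a mini-batch for the SAC update."""
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))
```

Because the deque is bounded, once on-policy experience fills the buffer the earliest entries, i.e. the demonstrations, are the first to be discarded, matching the eviction behavior described above under these assumptions.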
