An accurate mathematical model is the basis for controlling and estimating the state of an autonomous underwater vehicle (AUV) system, so improving model accuracy is a fundamental problem in automatic control. However, AUV systems are complex, uncertain, and highly nonlinear, and an accurate model is not easy to obtain through traditional modeling methods. In this study, we fit an accurate dynamic AUV model using a long short-term memory (LSTM) neural network. Because hyperparameter values have a significant impact on LSTM performance, it is important to select the optimal combination of hyperparameters. We use an improved Q-learning reinforcement learning algorithm to select this combination, with the aim of improving the network's recognition accuracy on the verification dataset. To make action exploration more efficient, the improved algorithm chooses the optimal initial state according to the Q table in each round of learning, which effectively keeps the reinforcement learning agent from exploring ineffectively among poorly performing hyperparameter combinations. Finally, experiments based on simulated or actual trial data demonstrate that the proposed model identification method can effectively predict kinematic motion data and, more importantly, that the modified Q-learning approach can optimize the hyperparameters of the LSTM network.
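The hyperparameter search described above can be illustrated with a minimal sketch of tabular Q-learning in which each round starts from the state that currently has the highest value in the Q table. Everything specific here is an assumption made for illustration and is not taken from the paper: the grid `GRID`, the action set, the constants `alpha`, `gamma`, and `epsilon`, and the function `toy_validation_score`, which stands in for training the LSTM and scoring it on held-out data.

```python
import random

# Hypothetical search grid; the abstract does not list the actual hyperparameters.
GRID = {
    "hidden_units": [16, 32, 64, 128],
    "learning_rate": [1e-4, 1e-3, 1e-2],
}
NAMES = list(GRID)
# Each action moves one hyperparameter one step down or up along its grid axis.
ACTIONS = [(axis, delta) for axis in range(len(NAMES)) for delta in (-1, 1)]


def toy_validation_score(state):
    """Stand-in for training the LSTM and scoring it (assumption, not real data)."""
    hidden = GRID["hidden_units"][state[0]]
    # Arbitrary peak at hidden_units=64 and the middle learning rate.
    return 1.0 - abs(hidden - 64) / 128 - abs(state[1] - 1) * 0.2


def step(state, action):
    """Apply an action to a state of grid indices, clipping at the boundaries."""
    axis, delta = action
    idx = list(state)
    idx[axis] = min(max(idx[axis] + delta, 0), len(GRID[NAMES[axis]]) - 1)
    return tuple(idx)


def initial_state(q_table, rng):
    """Pick the state with the highest Q value so far; random if the table is empty."""
    if q_table:
        return max(q_table, key=lambda s: max(q_table[s]))
    return tuple(rng.randrange(len(GRID[n])) for n in NAMES)


def search(episodes=30, steps=10, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    rng = random.Random(seed)
    q_table = {}
    best_state, best_score = None, float("-inf")
    for _ in range(episodes):
        state = initial_state(q_table, rng)
        for _ in range(steps):
            q_row = q_table.setdefault(state, [0.0] * len(ACTIONS))
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                top = max(q_row)  # exploit, breaking ties at random
                a = rng.choice([i for i, q in enumerate(q_row) if q == top])
            nxt = step(state, ACTIONS[a])
            reward = toy_validation_score(nxt)
            nxt_row = q_table.setdefault(nxt, [0.0] * len(ACTIONS))
            # Standard Q-learning update.
            q_row[a] += alpha * (reward + gamma * max(nxt_row) - q_row[a])
            if reward > best_score:
                best_state, best_score = nxt, reward
            state = nxt
    best_config = {n: GRID[n][i] for n, i in zip(NAMES, best_state)}
    return best_config, best_score
```

Calling `search()` returns the best configuration visited and its toy score. In a real setting the score function would train and evaluate the network, so each step would be expensive, which is why starting each round from a promising state rather than a random one could plausibly save evaluations.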