Abstract

Autonomous underwater vehicles (AUVs) are widely used in complex underwater missions such as bottom surveying and data collection. Multiple AUVs can cooperatively complete tasks that a single AUV cannot accomplish. Recently, multi-agent reinforcement learning (MARL) has been introduced to improve multi-AUV control in uncertain marine environments. However, designing effective and efficient reward functions for the various tasks is difficult and often impractical. In this paper, we apply multi-agent generative adversarial imitation learning (MAGAIL), which learns from expert-demonstrated trajectories, to formation control and obstacle avoidance for multiple AUVs. In addition, a decentralized training with decentralized execution framework is adopted to alleviate the communication constraints of underwater environments. Moreover, to help the discriminator judge the quality of each AUV's trajectory accurately across the two tasks and to increase the convergence speed, we improve upon MAGAIL by dividing the expert state–action pairs for each AUV into two groups and updating the discriminator with an equal number of randomly selected state–action pairs from each group. Experimental results on a simulated AUV system modeling our lab's Sailfish 210 in the Gazebo simulation environment show that MAGAIL produces better multi-AUV control policies than IPPO, a traditional multi-agent deep reinforcement learning method trained with a fine-tuned reward function. Moreover, control policies trained via MAGAIL on simple tasks generalize better to complex tasks than those trained via IPPO.
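The following is a minimal sketch, not the authors' implementation, of the balanced discriminator update described above: the expert state–action pairs for each AUV are split into two groups (the abstract does not specify the split criterion; here we assume one group per task, formation control and obstacle avoidance), and each discriminator batch draws an equal number of pairs from both. The network sizes and the helper names `sample_balanced_expert_batch` and `discriminator_step` are illustrative assumptions.

```python
# Sketch of a GAIL-style discriminator with balanced expert sampling.
# Assumptions: each expert group is a (states, actions) pair of NumPy
# arrays; the two-group split and all names are hypothetical.
import numpy as np
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Scores state-action pairs; higher logits mean 'more expert-like'."""
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, states, actions):
        return self.net(torch.cat([states, actions], dim=-1))

def sample_balanced_expert_batch(group_a, group_b, batch_size, rng):
    """Draw batch_size // 2 expert state-action pairs from each group."""
    half = batch_size // 2
    idx_a = rng.choice(len(group_a[0]), half, replace=False)
    idx_b = rng.choice(len(group_b[0]), half, replace=False)
    states = np.concatenate([group_a[0][idx_a], group_b[0][idx_b]])
    actions = np.concatenate([group_a[1][idx_a], group_b[1][idx_b]])
    return (torch.as_tensor(states, dtype=torch.float32),
            torch.as_tensor(actions, dtype=torch.float32))

def discriminator_step(disc, opt, expert_batch, policy_batch):
    """One update: expert pairs labeled 1, policy-generated pairs 0."""
    bce = nn.BCEWithLogitsLoss()
    exp_logits = disc(*expert_batch)
    pol_logits = disc(*policy_batch)
    loss = (bce(exp_logits, torch.ones_like(exp_logits))
            + bce(pol_logits, torch.zeros_like(pol_logits)))
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

In a full MAGAIL training loop, `discriminator_step` would alternate with policy updates for each AUV, with the discriminator's output serving as the imitation reward signal for the policies.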
