Abstract

Unmanned surface vessel (USV) operations will profoundly change the form of future maritime warfare, and one of the critical factors for victory is the cluster intelligence of USVs. Training USVs for combat using reinforcement learning (RL) is therefore an important research direction. Sparse rewards, one of the well-known difficulties in reinforcement learning, make USV training slow and inefficient. To address the sparse reward problem, a modified random network distillation (MRND) algorithm is proposed. The algorithm sets the weight of the internal (intrinsic) reward from the variance of the number of training steps across episodes, dynamically balancing internal and external rewards. Through a self-play iterative training scheme, the proposed algorithm, combined with the classical proximal policy optimization (PPO) algorithm, can rapidly improve USV cluster intelligence. Based on USV cluster combat training environments built on the Unity3D and ML-Agents Toolkit platforms, three types of USV cluster combat simulations are conducted to validate the algorithm: a target pursuit combat simulation, a USV cluster maritime combat simulation, and a USV cluster base offense and defense combat simulation. The simulation experiments show that USV clusters trained with the MRND algorithm converge faster, acquire more reward in fewer steps, and exhibit a higher level of intelligence than those trained with the comparison algorithms.
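The abstract only sketches the reward-mixing rule, so the following is a minimal Python illustration of the stated idea, not the authors' implementation: the intrinsic (RND) bonus is weighted by the variance of recent per-episode step counts, and the exact mapping from variance to weight (a bounded normalization here) as well as all function names are assumptions.

```python
import numpy as np

def intrinsic_weight(episode_lengths, base_weight=1.0):
    """Hypothetical weighting rule: the variance of per-episode step
    counts scales the intrinsic-reward weight. High variance (unstable
    training) leans on the exploration bonus; zero variance disables it.
    The bounded mapping var/(var+1) is an assumption, not from the paper."""
    var = float(np.var(episode_lengths))
    return base_weight * var / (var + 1.0)

def mixed_reward(r_ext, r_int, episode_lengths):
    """Total reward = external reward + dynamically weighted internal
    (RND novelty) bonus, per the abstract's description of MRND."""
    beta = intrinsic_weight(episode_lengths)
    return r_ext + beta * r_int
```

With constant episode lengths the variance is zero, so the agent receives only the external reward; as episode lengths fluctuate, the RND bonus is blended back in.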
