Autonomous confrontation strategy learning evolution mechanism of unmanned system group under actual combat in the loop

Zhenhua Wang, Yan Guo, Ning Li, Hao Yuan, Shiguang Hu, Binghan Lei, Jianyu Wei

doi:10.1016/j.comcom.2023.07.006

Abstract

Unmanned system group (USG) confrontation is an important combat pattern in future aerial combat, and the ability to learn and evolve autonomous confrontation strategies is a prerequisite for applying a USG in actual combat. To address the problems of realizing active learning and evolution of the USG autonomous confrontation strategy in highly dynamic actual combat scenarios through continuous interaction with a commander, this paper proposes an Autonomous Confrontation Strategy learning evolution mechanism of USG under Actual Combat in the Loop (ACS-ACL). The Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm is selected as the baseline, and a Parallel Decoupling Reward Mechanism (PDRM) is introduced to improve its applicability, which yields the generation model of the USG autonomous confrontation strategy. After generating the initial autonomous confrontation strategy, the USG proactively initiates Continuous Interaction (CI) with the commander of actual combat in the loop and uploads its perception information about the recessive battlefield situation; the commander proofreads and supplements this battlefield situation information and transmits it back to the USG together with the combat intention. The USG then updates its experience replay pool, updates its autonomous confrontation strategy according to the combat intention, and simultaneously updates its strategy for interacting with the commander, thereby establishing a benign closed-loop learning evolution mechanism for the autonomous confrontation strategy. Taking as the scenario a mission in which the USG collaboratively searches for enemy moving targets, a visual dynamic confrontation game environment for USG autonomous collaborative search is constructed, and a series of simulation validation experiments is carried out. 
Observation and comparison show that the convergence efficiency and execution quality of the autonomous confrontation strategy driven by combat intention improve significantly, that strategy learning exhibits a benign evolution trend, and that the credibility of applying the USG autonomous confrontation strategy in actual combat is further improved.
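
The Continuous Interaction step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Transition`, `Feedback`, `intention_bonus`, `apply_feedback`, and `interaction_round`, the grid-cell representation of the situation, and the distance-based reward term are all assumptions made for illustration. The MADDPG policy update and the PDRM itself are left out. The sketch only shows one round in which the USG uploads a perceived observation, the commander returns a corrected observation plus a combat intention, and the corrected, intention-shaped experience is stored in the replay pool.

```python
from collections import deque
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Transition:
    obs: tuple       # perceived grid cell (x, y); an assumed representation
    action: int
    reward: float
    next_obs: tuple  # grid cell reached after the action


@dataclass(frozen=True)
class Feedback:
    corrected_obs: tuple  # commander's proofread version of the observation
    intention: tuple      # hypothetical: grid cell the commander wants searched


def intention_bonus(pos, intention, scale=0.1):
    """Hypothetical shaping term: penalty grows with Manhattan distance."""
    dist = abs(pos[0] - intention[0]) + abs(pos[1] - intention[1])
    return -scale * dist


def apply_feedback(t, fb):
    """Swap in the corrected observation and add the intention-driven term."""
    shaped = t.reward + intention_bonus(t.next_obs, fb.intention)
    return replace(t, obs=fb.corrected_obs, reward=shaped)


def interaction_round(pool, new_transitions, commander):
    """One CI round: upload each perception, store the corrected experience."""
    for t in new_transitions:
        fb = commander(t.obs)  # commander is any callable returning Feedback
        pool.append(apply_feedback(t, fb))
    return pool


# Usage: a stub commander that shifts the perceived x by one cell and asks
# the group to search cell (3, 3). Both behaviours are made up for the demo.
pool = deque(maxlen=100)  # bounded replay pool
stub_commander = lambda obs: Feedback((obs[0] + 1, obs[1]), (3, 3))
interaction_round(pool, [Transition((0, 0), 1, 1.0, (1, 3))], stub_commander)
```

A policy learner would then sample from `pool` for its next update, so that the commander's corrections and intention influence later strategy updates.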

Full Text