UCAV Air Combat Maneuver Decisions Based on a Proximal Policy Optimization Algorithm with Situation Reward Shaping

Kaibiao Yang, Ri Liu, Shengde Jia, Ming Cai, Wenhan Dong

doi:10.3390/electronics11162602

Abstract

Autonomous maneuver decision-making by an unmanned combat air vehicle (UCAV) is a critical part of air combat that requires both flight safety and tactical maneuvering. In this paper, a UCAV air combat maneuver decision method based on the proximal policy optimization (PPO) algorithm is proposed. First, a motion model of the UCAV and a situation assessment model of air combat were established to describe the UCAV's motion situation. An enemy maneuver policy based on situation assessment with a greedy algorithm was also proposed for the air combat confrontation, serving as the opponent against which the performance of the PPO algorithm was verified. Then, an action space based on a basic maneuver library and a state observation space for the PPO algorithm were constructed, and a reward function with situation reward shaping was designed to accelerate convergence. Finally, a simulation of air combat confrontation was carried out, which showed that the agent using the PPO algorithm learned to combine a series of basic maneuvers, such as diving, climbing, and circling, into tactical maneuvers and eventually defeated the enemy. The winning rate of the PPO algorithm reached 62%, and the corresponding losing rate was only 11%.
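
For readers unfamiliar with the two ingredients named in the abstract, the following is a minimal, generic sketch and not the authors' implementation. It assumes the standard PPO clipped surrogate objective and, as a stand-in for the paper's situation reward shaping (whose exact form the abstract does not give), the textbook potential-based shaping term. The function names, the potential values `phi_s` and `phi_next`, and the defaults `gamma=0.99` and `clip_eps=0.2` are all hypothetical choices made for illustration.

```python
import math


def shaped_reward(base_reward, phi_s, phi_next, gamma=0.99):
    """Potential-based shaping (assumed form): r' = r + gamma * phi(s') - phi(s).

    phi_s and phi_next stand for a scalar situation score of the current
    and next state; how the paper computes its situation score is not
    specified in the abstract.
    """
    return base_reward + gamma * phi_next - phi_s


def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized)."""
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Taking the minimum removes any incentive for large policy steps.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

In this sketch, a transition that improves the situation score receives a positive bonus on top of the sparse win/loss reward, which is the usual motivation for shaping when the goal is faster convergence.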

Journal: Electronics	Publication Date: Aug 19, 2022
Citations: 10	License type: CC BY 4.0

Similar Papers

Maneuver Strategy Generation of UCAV for within Visual Range Air Combat Based on Multi-Agent Reinforcement Learning and Target Position Prediction
Weiren Kong ... Kai Zhang
Applied Sciences | VOL. 10
28 Jul 2020

Autonomous Maneuver Decision of UCAV Air Combat Based on Double Deep Q Network Algorithm and Stochastic Game Theory
Yuan Cao ... Zhan-Wu Li
International Journal of Aerospace Engineering | VOL. 2023
16 Jan 2023

One-to-one Close Air Combat Maneuver Decision Method Based On Target Maneuver Intention Prediction
Haodong Meng ... Yunchong Feng
28 Oct 2022

Optimal Guidance Method for UCAV in Close Free Air Combat
Yaofei Chen ... Dejian Liu
01 Oct 2019
