Abstract

This paper proposes a method for unmanned aerial vehicle (UAV) maneuver decision-making in air combat based on the deterministic policy gradient. To address the continuity of the decision space, the method builds on reinforcement learning theory and adopts the Deep Deterministic Policy Gradient (DDPG) architecture, which avoids the curse of dimensionality caused by discretizing the decision variables and allows air combat decisions to be made directly in a continuous decision space. In the design of the reward function, an energy term is added to the traditional distance and angle evaluation factors so that the reward describes the air combat situation more accurately. Through autonomous reinforcement learning training, the UAV gradually learns air combat decision-making strategies without prior knowledge, which enables it to gain a maneuvering advantage in air combat. Simulation experiments show that, with the proposed algorithm model, the UAV can learn autonomously, complete air combat maneuver decisions, and gain the advantage in the air combat maneuver decision-making environment.
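
As a rough illustration of the reward design described above, the sketch below combines an angle term, a distance term, and an energy term into one scalar reward. The abstract does not give the paper's formulas, so everything here is an assumption made for illustration: the function names, the weights, the scale constants, the particular shape of each term, and the use of specific energy (altitude plus v^2 / 2g) as the energy measure.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def specific_energy(altitude_m, speed_mps):
    """Specific energy ("energy height") in metres: h + v^2 / (2g)."""
    return altitude_m + speed_mps ** 2 / (2.0 * G)


def situation_reward(
    attack_angle_rad,      # angle between own velocity and line of sight to the enemy
    target_aspect_rad,     # angle between enemy velocity and the same line of sight
    distance_m,
    own_alt_m,
    own_speed_mps,
    enemy_alt_m,
    enemy_speed_mps,
    weights=(0.4, 0.3, 0.3),       # hypothetical weights, not from the paper
    preferred_distance_m=1000.0,   # hypothetical preferred engagement range
    distance_scale_m=2000.0,       # hypothetical decay scale for the distance term
    energy_scale_m=1000.0,         # hypothetical scale for the energy term
):
    """Illustrative air combat situation reward (assumed form, not the paper's)."""
    w_angle, w_dist, w_energy = weights

    # Angle term in [-1, 1]: +1 when we point at the enemy's tail (both angles 0),
    # -1 when the enemy points at ours (both angles pi).
    angle_term = 1.0 - (attack_angle_rad + target_aspect_rad) / math.pi

    # Distance term in (0, 1]: peaks at the preferred range and decays away from it.
    dist_term = math.exp(-abs(distance_m - preferred_distance_m) / distance_scale_m)

    # Energy term in (-1, 1): positive when we hold more specific energy.
    energy_gap = specific_energy(own_alt_m, own_speed_mps) - specific_energy(
        enemy_alt_m, enemy_speed_mps
    )
    energy_term = math.tanh(energy_gap / energy_scale_m)

    return w_angle * angle_term + w_dist * dist_term + w_energy * energy_term


# Two example situations at the same range: one favourable, one unfavourable.
favourable = situation_reward(0.0, 0.0, 1000.0, 5000.0, 250.0, 4000.0, 200.0)
unfavourable = situation_reward(math.pi, math.pi, 1000.0, 4000.0, 200.0, 5000.0, 250.0)
```

Because every term is bounded, the reward stays on a fixed scale, which tends to help when training a continuous-action agent such as DDPG. The weighting of the three terms is a design choice that would need tuning.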
