Abstract

With the development of information technology, the degree of intelligence in air combat is increasing, and the demand for automated intelligent decision-making systems is growing. Based on the characteristics of over-the-horizon air combat, this paper constructs an over-the-horizon air combat training environment, which includes aircraft modeling, air combat scene design, enemy aircraft strategy design, and reward and punishment signal design. To improve the efficiency with which the reinforcement learning algorithm explores the strategy space, this paper proposes a heuristic Q-Network method that integrates expert experience, using expert experience as a heuristic signal to guide the search process and combining heuristic exploration with random exploration. For the over-the-horizon air combat maneuver decision problem, the heuristic Q-Network method is used to train a neural network model in the over-the-horizon air combat training environment; through continuous interaction with the environment, self-learning of the air combat maneuver strategy is realized. The efficiency of the heuristic Q-Network method and the effectiveness of the learned air combat maneuver strategy are verified by simulation experiments.
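The abstract states that expert experience serves as a heuristic signal and that heuristic exploration is combined with random exploration, but it does not give the exact mixing rule. Below is a minimal Python sketch of one plausible action-selection scheme; the mixing probabilities eta and epsilon, and the helper name select_action, are illustrative assumptions, not the authors' implementation.

    import random

    import numpy as np

    def select_action(q_values, expert_action, epsilon, eta):
        """Choose an action by mixing three sources (hypothetical rule):
        - with probability eta:     follow the expert's heuristic suggestion
        - with probability epsilon: explore with a uniformly random action
        - otherwise:                exploit the Q-network's greedy action
        """
        r = random.random()
        if r < eta:                       # heuristic exploration
            return expert_action
        if r < eta + epsilon:             # random exploration
            return random.randrange(len(q_values))
        return int(np.argmax(q_values))   # exploitation

    # Example: three maneuver actions, expert rule base suggests action 2
    q_values = np.array([0.10, 0.70, 0.30])
    action = select_action(q_values, expert_action=2, epsilon=0.1, eta=0.3)

Annealing eta toward zero as training progresses would hand control from the expert prior to the learned policy, which is one common way such heuristic guidance is phased out.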

Highlights

  • The intelligent air confrontation decision-making system can be effectively applied to automatic/autonomous simulated air confrontation, maneuver confrontation, anti-interception, and various auxiliary decision-making systems of manned/unmanned aerial vehicles

  • In the process of over-the-horizon air confrontation, reasonable maneuver decision-making is the prerequisite for weapon attack, sensor use, electronic countermeasures, and other decisions

  • When the enemy aircraft falls within the radar detection range of our aircraft, enemy information can be obtained relatively accurately; when the enemy aircraft is outside our radar detection area, it is assumed that our aircraft can still obtain enemy aircraft information through other sources in the confrontation system, but with a much larger error (a sketch of this two-regime sensor model follows this list)
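A rough Python sketch of the two-regime observation model described above, assuming a single detection-range threshold and Gaussian measurement noise; the function name observe_enemy, the range, and the noise levels are illustrative placeholders rather than values from the paper.

    import numpy as np

    def observe_enemy(enemy_pos, own_pos, radar_range_km=80.0,
                      in_radar_sigma=0.1, off_radar_sigma=2.0):
        """Estimate the enemy position under a two-regime sensor model.
        Inside radar range the measurement error is small; outside it,
        information is assumed to come from other sources in the
        confrontation system with a much larger error. All numeric
        values here are illustrative placeholders."""
        distance = np.linalg.norm(enemy_pos - own_pos)
        sigma = in_radar_sigma if distance <= radar_range_km else off_radar_sigma
        return enemy_pos + np.random.normal(0.0, sigma, size=enemy_pos.shape)

    # Example: enemy 100 km away, so the estimate carries the larger error
    estimate = observe_enemy(np.array([100.0, 0.0]), np.array([0.0, 0.0]))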



Introduction

The intelligent air confrontation decision-making system can be effectively applied to automatic/autonomous simulated air confrontation, maneuver confrontation, anti-interception, and various auxiliary decision-making systems of manned/unmanned aerial vehicles. Based on one-on-one confrontation in over-the-horizon air combat, this paper mainly studies intelligent maneuver decision-making methods in this environment. Current air confrontation decision-making methods can be divided into two main categories: non-learning strategies and self-learning strategies. Self-learning strategies include genetic algorithms [6,7], artificial immune systems [8,9], supervised learning [10], reinforcement learning [11], etc. Reinforcement learning is a self-learning method that, through constant trial and error, interacts with the environment, gradually acquires knowledge, and improves its action plan to adapt. Reinforcement learning has been applied successfully in decision-making fields such as robot control and automatic driving.
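To make the trial-and-error loop concrete, here is a self-contained tabular Q-learning toy in Python in which the agent improves its action plan purely through interaction; the ToyEnv environment and all parameter values are stand-in illustrations, not the air combat model studied in this paper.

    import random

    class ToyEnv:
        """Tiny stand-in environment (not the paper's air combat model).
        The state is an integer 0..4; reaching state 4 ends the episode
        with reward 1."""

        def reset(self):
            self.state = 0
            return self.state

        def step(self, action):            # action: 0 = stay, 1 = advance
            self.state = min(self.state + action, 4)
            reward = 1.0 if self.state == 4 else 0.0
            return self.state, reward, self.state == 4

    q = [[0.0, 0.0] for _ in range(5)]     # tabular Q(s, a) estimates
    alpha, gamma, epsilon = 0.5, 0.9, 0.1  # learning rate, discount, exploration

    env = ToyEnv()
    for episode in range(100):             # repeated trial-and-error episodes
        s, done = env.reset(), False
        while not done:
            # epsilon-greedy: mostly exploit, occasionally explore at random
            if random.random() < epsilon:
                a = random.randrange(2)
            else:
                a = max(range(2), key=lambda i: q[s][i])
            s2, r, done = env.step(a)
            # Q-learning update: knowledge accumulates from each interaction
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2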

Training Scene Design
Enemy Strategy Design
Reward
Detection Capability
Azimuth Factor
Attack Threat
Reward and Punishment Signal Synthesis
Markov Decision Process Modeling
Air Confrontation State Space
Maneuvering Decision Action Space
Heuristic Q-Network
Air Confrontation Strategy Learning
Results
Conclusions