Abstract
With the development of information technology, the degree of intelligence in air combat is increasing, and the demand for automated intelligent decision-making systems is growing. Based on the characteristics of over-the-horizon air combat, this paper constructs an over-the-horizon air combat training environment, which includes aircraft modeling, air combat scenario design, enemy aircraft strategy design, and reward and penalty signal design. To improve the efficiency with which the reinforcement learning algorithm explores the strategy space, this paper proposes a heuristic Q-Network method that integrates expert experience, using that experience as a heuristic signal to guide the search process; heuristic exploration is combined with random exploration. For the over-the-horizon air combat maneuver decision problem, the heuristic Q-Network method is used to train a neural network model in the over-the-horizon air combat training environment. Through continuous interaction with the environment, the air combat maneuver strategy is learned autonomously. Simulation experiments verify both the efficiency of the heuristic Q-Network method and the effectiveness of the resulting air combat maneuver strategy.
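The exploration scheme described above, mixing expert-guided and random exploration on top of a learned Q-function, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function name, the probabilities `eps` and `p_h`, and the way the expert action is supplied are all assumptions for illustration.

```python
import random

def select_action(q_values, expert_action, n_actions, eps=0.2, p_h=0.5, rng=random):
    """Heuristic epsilon-greedy action selection (illustrative sketch).

    With probability eps the agent explores; within an exploration step,
    the expert-suggested action is taken with probability p_h (heuristic
    exploration), otherwise a uniformly random action is taken (random
    exploration). Otherwise the agent acts greedily on its Q estimates.
    """
    if rng.random() < eps:                       # exploration step
        if rng.random() < p_h:                   # guided by expert experience
            return expert_action
        return rng.randrange(n_actions)          # uniform random exploration
    # exploitation: greedy action under the current Q estimates
    return max(range(n_actions), key=lambda a: q_values[a])

if __name__ == "__main__":
    q = [0.1, 0.9, 0.3]
    # with eps=0 the choice is purely greedy
    print(select_action(q, expert_action=2, n_actions=3, eps=0.0))
    # with eps=1 and p_h=1 the expert action is always taken
    print(select_action(q, expert_action=2, n_actions=3, eps=1.0, p_h=1.0))
```

As training progresses, `eps` (and with it the reliance on the expert heuristic) would typically be annealed so the policy gradually shifts from guided exploration to exploiting its own learned Q-values.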
Highlights
The intelligent air confrontation decision-making system can be effectively applied to automatic/autonomous simulated air confrontation, maneuver confrontation, anti-interception, and various auxiliary decision-making systems for manned/unmanned aerial vehicles.
In over-the-horizon air confrontation, reasonable maneuver decision-making is the prerequisite for decisions about weapon attacks, sensor use, electronic countermeasures, and other actions.
When the enemy aircraft falls within the radar detection range of our aircraft, enemy information can be obtained accurately; when the enemy aircraft is outside the radar detection area, it is assumed that our aircraft can obtain enemy information through other information sources in the confrontation system, but information obtained in this way carries a large error.
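The two-regime observation assumption above can be sketched as a simple sensor model: an exact report inside the radar cone, and a noisy datalink-style estimate outside it. The detection geometry (range and half-angle of the cone) and the Gaussian error scale are illustrative assumptions, not values from the paper.

```python
import math
import random

def in_radar(own_pos, own_heading, enemy_pos,
             radar_range=80.0, half_angle=math.radians(60)):
    """True if the enemy lies inside an assumed forward radar cone."""
    dx, dy = enemy_pos[0] - own_pos[0], enemy_pos[1] - own_pos[1]
    if math.hypot(dx, dy) > radar_range:
        return False
    bearing = math.atan2(dy, dx)
    # wrap the off-boresight angle into [-pi, pi] before comparing
    off = abs((bearing - own_heading + math.pi) % (2 * math.pi) - math.pi)
    return off <= half_angle

def observe_enemy(own_pos, own_heading, enemy_pos, noise_std=10.0, rng=random):
    """Exact enemy position if detected by radar; large-error estimate otherwise."""
    if in_radar(own_pos, own_heading, enemy_pos):
        return enemy_pos
    return (enemy_pos[0] + rng.gauss(0, noise_std),
            enemy_pos[1] + rng.gauss(0, noise_std))
```

In a training environment, such a model lets the agent experience the degraded state information it would face when the opponent leaves its radar coverage.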
Summary
The intelligent air confrontation decision-making system can be effectively applied to automatic/autonomous simulated air confrontation, maneuver confrontation, anti-interception, and various auxiliary decision-making systems of manned/unmanned aerial vehicles. Based on one-to-one over-the-horizon air confrontation, this paper mainly studies the intelligent maneuver decision-making method in this environment. Current air confrontation decision-making methods can be divided into two main categories: non-learning strategies and self-learning strategies. Self-learning strategies include genetic algorithms [6,7], artificial immune systems [8,9], supervised learning [10], reinforcement learning [11], etc. Reinforcement learning is a self-learning method which, through constant trial and error, interacts with the environment, gradually acquires knowledge, and improves its action plans to adapt. Reinforcement learning has been applied with good results in decision-making fields such as robot control and automatic driving.