Abstract
Beyond-visual-range (BVR) engagement is increasingly common on the modern air battlefield. The key difficulty for pilots in such a fight is maneuver planning, which reflects the tactical decision-making capacity of both sides and determines success or failure. In this paper, we propose an intelligent maneuver planning method for BVR combat using an improved deep Q network (DQN). First, a basic combat environment is built, comprising a flight motion model, a relative motion model, and a missile attack model. Then, we create a maneuver decision framework for agent interaction with the environment. Basic perceptive variables are constructed for agents to form a continuous state space. Considering the missile threat from each side and the constraints of the airfield, a reward function is designed for agent training. Next, we introduce a training algorithm and propose perceptional situation layers and value fitting layers to replace the policy network in DQN. Based on long short-term memory (LSTM) cells, the perceptional situation layer converts the basic state into a high-dimensional perceptual situation, and the fitting layer maps this situation to actions. Finally, three combat scenarios are designed for agent training and testing. Simulation results show that the agent can avoid enemy threats and accumulate its own advantages to threaten the target, demonstrating that the models and methods are valid and that intelligent air combat can be realized.
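The reward-driven DQN training described above rests on two standard ingredients: epsilon-greedy selection over a discrete maneuver library, and a bootstrapped temporal-difference target. A minimal sketch follows; the action count, discount factor, and exploration rate are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
N_ACTIONS, GAMMA, EPSILON = 7, 0.95, 0.1  # assumed hyperparameters

def select_maneuver(q_values, epsilon=EPSILON):
    """Epsilon-greedy choice over a discrete maneuver library."""
    if rng.random() < epsilon:
        return int(rng.integers(N_ACTIONS))  # explore: random maneuver
    return int(np.argmax(q_values))          # exploit: best current Q

def td_target(reward, q_next, done, gamma=GAMMA):
    """DQN bootstrap target: r + gamma * max_a' Q(s', a')."""
    if done:
        return float(reward)
    return float(reward + gamma * np.max(q_next))

# Usage with random Q-values standing in for network output
q = rng.normal(size=N_ACTIONS)
a = select_maneuver(q)
y = td_target(reward=0.5, q_next=rng.normal(size=N_ACTIONS), done=False)
```

In a full agent, `y` would be regressed against `Q(s, a)` to update the network weights, with the reward term shaped by the missile-threat and airfield-constraint penalties the abstract mentions.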
Highlights
With the development from informatization to intelligence, technological progress will hasten the revolution of war.
A long short-term memory (LSTM)-deep Q network (DQN) algorithm is proposed: based on the agent decision framework in Fig. 7, a deep network and training algorithm are introduced for the agent.
The effectiveness and decision-making ability of LSTM-DQN are superior to those of other methods.
Summary
With the development from informatization to intelligence, technological progress will hasten the revolution of war. Many scholars have done in-depth research on close air combat and proposed mature algorithms. We summarize these methods and divide them into two categories: reactive decision-making and deductive decision-making. Deductive decision-making methods, such as game theory [17,18], dynamic programming (DP) [19,20], and Monte Carlo search [21], can give a solution considering the n-step decisions from the current state. These methods can find solutions without rule-based experience, but their real-time performance is poor. Literature [27] improves the model and trains a network for each single agent with the Q-learning method, but the use of a discrete state space and a discrete action space makes the results of air combat lack continuity. The whole LSTM-DQN algorithm performs well in the agent's state-situation-action mapping.
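The state-situation-action mapping can be pictured as a two-stage forward pass: an LSTM "perceptional situation layer" folds a sequence of basic states into a high-dimensional situation vector, and a linear "value fitting layer" maps that vector to Q-values over the discrete maneuvers. The numpy sketch below illustrates the structure only; the state dimension, hidden size, action count, and weight initialization are all assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, HIDDEN_DIM, N_ACTIONS = 6, 32, 7  # assumed sizes

def lstm_step(x, h, c, W, U, b):
    """One LSTM cell step: gate pre-activations stacked in z."""
    z = W @ x + U @ h + b
    n = h.size
    i = 1.0 / (1.0 + np.exp(-z[:n]))       # input gate
    f = 1.0 / (1.0 + np.exp(-z[n:2*n]))    # forget gate
    o = 1.0 / (1.0 + np.exp(-z[2*n:3*n]))  # output gate
    g = np.tanh(z[3*n:])                   # candidate cell state
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

# Perceptional situation layer parameters (LSTM)
W = rng.normal(0, 0.1, (4 * HIDDEN_DIM, STATE_DIM))
U = rng.normal(0, 0.1, (4 * HIDDEN_DIM, HIDDEN_DIM))
b = np.zeros(4 * HIDDEN_DIM)

# Value fitting layer: situation vector -> Q-value per maneuver
W_q = rng.normal(0, 0.1, (N_ACTIONS, HIDDEN_DIM))
b_q = np.zeros(N_ACTIONS)

def q_values(state_seq):
    """Run basic states through the LSTM, return Q-values at the end."""
    h = np.zeros(HIDDEN_DIM)
    c = np.zeros(HIDDEN_DIM)
    for x in state_seq:
        h, c = lstm_step(x, h, c, W, U, b)
    return W_q @ h + b_q

seq = [rng.normal(size=STATE_DIM) for _ in range(4)]
q = q_values(seq)
action = int(np.argmax(q))  # greedy maneuver choice
```

Because the LSTM carries hidden and cell state across the sequence, the resulting Q-values depend on the history of basic states, which is what lets the situation layer encode more than the instantaneous geometry.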