Abstract

Beyond-visual-range (BVR) engagement is becoming increasingly common on the modern air battlefield. The key difficulty for pilots in such a fight is maneuver planning, which reflects the tactical decision-making capacity of both sides and determines success or failure. In this paper, we propose an intelligent maneuver planning method for BVR combat using an improved deep Q network (DQN). First, a basic combat environment is built, consisting mainly of a flight motion model, a relative motion model, and a missile attack model. Then, we create a maneuver decision framework for agent interaction with the environment. Basic perceptive variables are constructed for agents to form a continuous state space. Considering the missile threat from each side and the constraint of the airfield, a reward function is designed for agent training. Next, we introduce a training algorithm and propose perceptional situation layers and value fitting layers to replace the policy network in DQN. Based on long short-term memory (LSTM) cells, the perceptional situation layer converts the basic state into a high-dimensional perception situation, and the fitting layer maps that situation to actions. Finally, three combat scenarios are designed for agent training and testing. Simulation results show that the agent can avoid enemy threats and accumulate its own advantages to threaten the target, demonstrating that the models and methods are valid and that intelligent air combat can be realized.
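The LSTM-based value network described in the abstract can be pictured with a short sketch. The PyTorch module below is a hypothetical illustration, not the paper's implementation: the class name, layer sizes, state dimension, and number of maneuver actions are all assumptions. It shows an LSTM "perceptional situation layer" encoding a sequence of basic perceptive variables into a high-dimensional situation vector, followed by fully connected "value fitting layers" that map the vector to a Q-value for each candidate maneuver.

```python
import torch
import torch.nn as nn

class LSTMDQN(nn.Module):
    """Hypothetical sketch of an LSTM-DQN value network: an LSTM
    'perceptional situation layer' turns a sequence of basic state
    variables into a high-dimensional situation vector, and fully
    connected 'value fitting layers' map it to one Q-value per
    discrete maneuver action."""

    def __init__(self, state_dim=12, hidden_dim=128, num_actions=7):
        super().__init__()
        # Perceptional situation layer: encodes the recent state history.
        self.situation = nn.LSTM(state_dim, hidden_dim, batch_first=True)
        # Value fitting layers: map the situation vector to action values.
        self.fitting = nn.Sequential(
            nn.Linear(hidden_dim, 64),
            nn.ReLU(),
            nn.Linear(64, num_actions),
        )

    def forward(self, state_seq):
        # state_seq: (batch, time, state_dim) history of perceptive variables
        _, (h_n, _) = self.situation(state_seq)
        situation_vec = h_n[-1]              # final hidden state of last LSTM layer
        return self.fitting(situation_vec)   # Q-value for each maneuver


# Example: greedy maneuver selection from a 10-step state history.
net = LSTMDQN()
history = torch.randn(1, 10, 12)
action = net(history).argmax(dim=1)
```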

Highlights

  • With the development from informatization to intelligence, technological progress will hasten the revolution of war

  • A long short-term memory (LSTM)-deep Q network (DQN) algorithm is proposed: based on the agent decision framework (Fig. 7), a deep network and training algorithm are introduced for the agent

  • The effectiveness and decision-making ability of LSTM-DQN are superior to those of other methods

Summary

INTRODUCTION

With the development from informatization to intelligence, technological progress will hasten the revolution of war. Many scholars have conducted in-depth research on close air combat and proposed relatively mature algorithms. We summarize these methods and divide them into two categories: reactive decision-making and deductive decision-making. Deductive decision-making methods, such as game theory [17,18], dynamic programming (DP) [19,20], and Monte Carlo search [21], can give a solution that considers the next n-step decisions from the current state. These methods can find solutions without rule-based experience, but their real-time performance is poor. Literature [27] improves the model and trains a network for each single agent with the Q-learning method, but the use of a discrete state space and a discrete action space makes the resulting air combat behavior lack continuity. The whole LSTM-DQN algorithm performs well in the agent's state-situation-action mapping.
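To make the state-situation-action mapping concrete, the sketch below shows a standard DQN-style training step that such an agent could use with the LSTMDQN module sketched above. The replay-buffer size, discount factor, optimizer, and target network are generic DQN choices assumed here for illustration; they are not the paper's exact hyperparameters or algorithm.

```python
import random
from collections import deque

import torch
import torch.nn.functional as F

# Replay buffer holds (state_seq, action, reward, next_state_seq, done) tuples,
# where state_seq is a (time, state_dim) tensor of basic perceptive variables.
replay = deque(maxlen=50_000)
policy_net, target_net = LSTMDQN(), LSTMDQN()
target_net.load_state_dict(policy_net.state_dict())
optimizer = torch.optim.Adam(policy_net.parameters(), lr=1e-4)
gamma = 0.99  # discount factor (assumed value)

def train_step(batch_size=32):
    if len(replay) < batch_size:
        return
    batch = random.sample(replay, batch_size)
    states = torch.stack([b[0] for b in batch])
    actions = torch.tensor([b[1] for b in batch], dtype=torch.long)
    rewards = torch.tensor([b[2] for b in batch], dtype=torch.float32)
    next_states = torch.stack([b[3] for b in batch])
    dones = torch.tensor([b[4] for b in batch], dtype=torch.float32)

    # Q(s, a) from the policy network for the actions actually taken.
    q = policy_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    # Bootstrapped target from the frozen target network.
    with torch.no_grad():
        q_next = target_net(next_states).max(dim=1).values
        target = rewards + gamma * q_next * (1.0 - dones)

    loss = F.smooth_l1_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The target network is refreshed from the policy network every fixed number of steps, which is the usual DQN device for stabilizing the bootstrapped targets.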

AIR COMBAT ENVIRONMENT DESIGN
AIR COMBAT MODEL WITH REINFORCEMENT LEARNING
Algorithmic representation
AIR COMBAT SIMULATION
COMPARISON AND DISCUSSION
TESTING RESULTS WITH DIFFERENT METHODS
CONCLUSION