Abstract

A midcourse maneuver controller is obtained using deep reinforcement learning to maintain the survivability of a ballistic missile. First, the midcourse is abstracted as a Markov decision process (MDP) with an unknown system state equation. Then, a controller formed by a Dueling Double Deep Q (D3Q) neural network is used to approximate the state-action value function of the MDP. So that deep reinforcement learning can improve the controller's intelligence, the state space, action space, and instant reward function of the MDP are customized. The controller takes the real-time situation as input and outputs the ignition states of the pulse motors. Offline training shows that deep reinforcement learning converges to the optimal strategy after approximately 65 hours. Online tests demonstrate the controller's ability to avoid an interceptor intelligently and to account for the re-entry error. In scenarios with multiple random factors, the controller achieved a penetration probability of 100% and a mean re-entry error of less than 5000 m.
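The dueling double Q-learning structure described above can be illustrated with a minimal sketch. The following assumes PyTorch; the state dimension, the discrete action set (ignition combinations of the pulse motors), the network widths, and the discount factor are illustrative placeholders rather than values taken from the paper.

```python
import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    """Dueling Q-network: a shared trunk followed by separate value
    and advantage heads, recombined into Q(s, a)."""
    def __init__(self, state_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.value = nn.Linear(hidden, 1)           # V(s)
        self.advantage = nn.Linear(hidden, n_actions)  # A(s, a)

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        h = self.trunk(state)
        v = self.value(h)
        a = self.advantage(h)
        # Dueling aggregation: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a)
        return v + a - a.mean(dim=1, keepdim=True)

def double_q_target(online: DuelingQNet, target: DuelingQNet,
                    next_state: torch.Tensor, reward: torch.Tensor,
                    done: torch.Tensor, gamma: float = 0.99) -> torch.Tensor:
    """Double-DQN bootstrap target: the online network selects the
    greedy next action, the target network evaluates it."""
    with torch.no_grad():
        greedy = online(next_state).argmax(dim=1, keepdim=True)
        q_next = target(next_state).gather(1, greedy).squeeze(1)
        return reward + gamma * (1.0 - done) * q_next
```

In this sketch the double-Q target decouples action selection from action evaluation, which is the standard way to reduce the overestimation bias of plain Q-learning; the dueling heads separate the state value from the per-action advantage, here interpreted as the value of each candidate pulse-motor ignition state.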

Highlights

  • Ballistic missiles have a long flight time in midcourse and a fixed trajectory

  • To eliminate the re-entry error caused by the midcourse maneuver, Reference [8] used the remaining pulse motors to return to a preset ballistic trajectory

  • A rapid trajectory optimization algorithm was proposed for the whole course under multiple constraints and multiple detection zones


INTRODUCTION

Ballistic missiles have a long flight time in midcourse and a fixed trajectory, so various countries regard midcourse interception as the core strategy of their missile defense systems [1]~[3]. Reference [7] proposed a midcourse penetration strategy using an axial impulse maneuver and provided a detailed trajectory design method. This penetration strategy does not require lateral pulse motors; the idea is to design, before launch, a trajectory that evades the enemy's detection zone. Solving this trajectory design problem is a complex nonlinear programming problem with multiple constraints and multiple stages. Based on accurate models of a penetrating spacecraft and an interceptor, Reference [12] proposed a guidance law using a state-dependent Riccati equation (SDRE). This approach achieved superior combat effectiveness compared with classic differential game theory.

PROBLEM FORMULATION
ANALYSIS OF THE MIDCOURSE PENETRATION
MARKOV DECISION PROCESS WITH UNKNOWN SYSTEM STATE EQUATION
TRAINING ALGORITHM
CONCLUSION