Proximal policy optimization guidance algorithm for intercepting near-space maneuvering targets

Wenxue Chen, Changsheng Gao, Wuxing Jing

doi:10.1016/j.ast.2022.108031

Abstract

This paper studies a novel deep reinforcement learning (DRL) guidance framework for an interceptor engaging a high-speed maneuvering target. The framework accounts for energy consumption, autopilot lag dynamics, and input saturation, and it copes effectively with flight phases that have a large flight-path angle error and with various uncertainties. It learns an end-to-end mapping from the observation states to the guidance command, where the observation states consist of the line-of-sight (LOS) angle, the relative distance, and their rates as measured by the seeker. The observability of the LOS angle and relative distance is also included in the construction of the reward function. The relative engagement kinematic model of the interceptor and target is established and combined with the proximal policy optimization (PPO) guidance algorithm, and the two are jointly described as a Markov decision process (MDP). The guidance framework is optimized using an improved PPO algorithm and demonstrated in a simulated terminal phase in near space. The PPO guidance algorithm consists of a policy (actor) neural network and a critic neural network, both of which are standard fully connected networks. Observation states and rewards are collected and reused through an experience replay method. An exponentially decaying learning rate, mini-batch stochastic gradient ascent (SGA), z-score standardization, and the Adam optimizer are used to train the reinforcement learning algorithm more efficiently. The proposed guidance framework generalizes well across fixed and stochastic engagement scenarios, which means that the interceptor can handle practical combat scenarios that it did not encounter during training. Its robustness is validated using Monte Carlo simulation under various uncertainties. Moreover, the DRL guidance framework can satisfy the onboard application requirement.
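
As a rough illustration of three training ingredients named in the abstract (standardization, an exponentially decaying learning rate, and the PPO objective maximized by gradient ascent), the sketch below uses only the Python standard library. The abstract does not describe the authors' improved PPO variant, so this shows the standard clipped surrogate objective instead. All function names, the decay schedule form, and the default values (`clip_eps=0.2`, `eps=1e-8`) are assumptions made for illustration, not details taken from the paper.

```python
import math
from statistics import fmean, pstdev


def zscore(values, eps=1e-8):
    """Standardize values to zero mean and (approximately) unit variance."""
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation
    return [(v - mean) / (std + eps) for v in values]


def exp_decay_lr(lr0, decay_rate, step, decay_steps):
    """Assumed schedule: lr0 * decay_rate ** (step / decay_steps)."""
    return lr0 * decay_rate ** (step / decay_steps)


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of standard PPO (to be maximized)."""
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the updated and the data-collecting policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

In a training loop of this kind, the standardized advantages of a mini-batch would be passed to `ppo_clip_objective`, and an optimizer such as Adam would take an ascent step on the result with the step size given by `exp_decay_lr`. The clipping keeps a single update from moving the policy far from the one that collected the data.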

Journal: Aerospace Science and Technology
Publication Date: Nov 25, 2022
Citations: 16
