Abstract

To address the problem of manoeuvring decision‐making in UAV air combat, this study establishes a one‐to‐one air combat model, defines missile attack areas, and uses the non‐deterministic‐policy Soft Actor‐Critic (SAC) algorithm from deep reinforcement learning to construct a decision model that realises the manoeuvring process. The computational complexity of the proposed algorithm is calculated, and the stability of the neural‐network‐controlled closed‐loop air combat decision system is analysed with a Lyapunov function. The study frames the UAV air combat process as a game and proposes a Parallel Self‐Play SAC algorithm (PSP‐SAC) to improve the generalisation performance of UAV control decisions. Simulation results show that the proposed algorithm enables sample sharing and policy sharing across multiple combat environments and significantly improves the generalisation ability of the model compared with independent training.
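The core of the PSP‐SAC idea described above is that several combat environments train in parallel while contributing transitions to one replay buffer (sample sharing) and reading from one actor (policy sharing). The sketch below is a minimal, hypothetical illustration of that training loop; the class names, the placeholder random policy, and all parameters are assumptions for illustration, not the paper's implementation.

```python
import random
from collections import deque


class SharedReplayBuffer:
    """One buffer receives transitions from every parallel environment
    (the paper's 'sample sharing')."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Sample at most batch_size transitions, fewer early in training.
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))


class SharedPolicy:
    """Stand-in for the SAC actor: a single parameter set serves every
    environment (the paper's 'policy sharing'). The real actor would be a
    neural network trained with the SAC entropy-regularised objective."""

    def __init__(self):
        self.updates = 0

    def act(self, state):
        # Placeholder stochastic manoeuvre command in [-1, 1].
        return random.uniform(-1.0, 1.0)

    def update(self, batch):
        # A real SAC update would adjust actor/critic weights here.
        self.updates += 1


def parallel_self_play(n_envs=3, steps=5):
    """Each step, every environment acts with the shared policy and pushes
    its transition into the shared buffer; one update then uses the pooled
    samples."""
    buffer = SharedReplayBuffer()
    policy = SharedPolicy()
    for step in range(steps):
        for env_id in range(n_envs):
            state = (env_id, step)
            action = policy.act(state)
            buffer.push((state, action))
        policy.update(buffer.sample(8))
    return buffer, policy
```

With 3 environments and 5 steps, the shared buffer accumulates 15 transitions while the single policy is updated 5 times, which is the mechanism the abstract credits for the improved generalisation over independent training.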
