Given the rapid advancements in kinetic pursuit technology, this paper introduces an innovative maneuvering strategy, denoted as LSRC-TD3, which integrates line-of-sight (LOS) angle rate correction with deep reinforcement learning (DRL) for high-speed unmanned aerial vehicle (UAV) pursuit–evasion (PE) game scenarios, with the aim of effectively evading high-speed and high-dynamic pursuers. In the challenging situations of the game, where both speed and maximum available overload are at a disadvantage, the playing field of UAVs is severely compressed, and the difficulty of evasion is significantly increased, placing higher demands on the strategy and timing of maneuvering to change orbit. While considering evasion, trajectory constraint, and energy consumption, we formulated the reward function by combining “terminal” and “process” rewards, as well as “strong” and “weak” incentive guidance to reduce pre-exploration difficulty and accelerate convergence of the game network. Additionally, this paper presents a correction factor for LOS angle rate into the double-delay deterministic gradient strategy (TD3), thereby enhancing the sensitivity of high-speed UAVs to changes in LOS rate, as well as the accuracy of evasion timing, which improves the effectiveness and adaptive capability of the intelligent maneuvering strategy. The Monte Carlo simulation results demonstrate that the proposed method achieves a high level of evasion performance—integrating energy optimization with the requisite miss distance for high-speed UAVs—and accomplishes efficient evasion under highly challenging PE game scenarios.
Read full abstract