Cooperative pursuit with multiple pursuers based on Deep Minimax Q-learning

Genjiu Xu, Jianjun Ge, Zekun Duan, Mingqiang Li, Liying Wang, Mengda Ji, Zesheng Li

doi:10.1016/j.ast.2024.108919

Abstract

Cooperative pursuit with multiple pursuers is critical in swarm missions. For example, capturing a faster-moving evader requires deploying multiple unmanned aerial vehicles that form an encirclement and then close in on the evader. In this paper, we develop a cooperative pursuit strategy that combines reinforcement learning techniques with pursuit-evasion games. First, we propose a novel surrounding algorithm that forms a complete encirclement based on the dynamics of pursuit-evasion games. Building on this surrounding algorithm, we use reinforcement learning to train cooperative pursuit strategies that capture the evader by surrounding and approaching it. In pursuit-evasion games, as in other non-cooperative games, the opponent is assumed to be able to model the other players' behaviors, adapt its adversarial strategy, and exploit their weaknesses. The cooperative pursuit strategy must therefore account for the evader's adversarial strategies and counteract them. To address this issue, we employ a Deep Minimax Q-learning algorithm that simultaneously learns the evader's adversarial strategy and a cooperative pursuit strategy that the evader cannot exploit. Finally, we present extensive numerical simulations that evaluate the performance of the proposed learning-based cooperative pursuit strategy in various scenarios.
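
As a rough illustration of the minimax idea behind the learning step, the sketch below performs one Q-value update in which the pursuers' value of a state is the best payoff they can guarantee against the evader's worst-case reply. It is not the paper's implementation. It assumes a small lookup table in place of a deep network, restricts both sides to pure strategies (the classical minimax-Q formulation solves for a mixed strategy instead), and uses hypothetical names, states, and numbers throughout.

```python
# Minimal tabular sketch of a minimax-style Q update (assumption: a lookup
# table stands in for the deep network; all names and values are illustrative).

def maximin_value(q_matrix):
    """Worst-case-optimal value for the pursuers in one state.

    q_matrix[a][o] is the pursuers' estimated return when they take joint
    action a and the evader replies with action o. Pure-strategy
    simplification: take the max over a of the min over o.
    """
    return max(min(row) for row in q_matrix)


def minimax_q_update(q, state, a, o, reward, next_state, alpha=0.1, gamma=0.9):
    """Move Q(state, a, o) toward reward + gamma * maximin value of next_state."""
    target = reward + gamma * maximin_value(q[next_state])
    q[state][a][o] += alpha * (target - q[state][a][o])
    return q[state][a][o]


# Toy table: two hypothetical states, two pursuer actions, two evader actions.
q = {
    "s0": [[0.0, 0.0], [0.0, 0.0]],
    "s1": [[1.0, -1.0], [0.5, 0.2]],
}

new_value = minimax_q_update(
    q, "s0", a=0, o=1, reward=1.0, next_state="s1", alpha=0.5, gamma=0.9
)
print(new_value)
```

In state "s1" the second pursuer action guarantees at least 0.2 whatever the evader does, while the first can be punished down to -1.0, so the update bootstraps from 0.2 rather than from the optimistic 1.0.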

Full Text