Abstract

Flocking with a swarm of unmanned aerial vehicles (UAVs) plays an important role in a variety of applications. However, the complexity of developing a collision-free flocking policy for a UAV swarm grows tremendously with the swarm scale. In this paper, a novel curriculum-based multi-agent deep reinforcement learning (MADRL) approach, called PopulAtion-Specific Curriculum-based MADRL (PASCAL), is proposed for flocking with collision avoidance in large-scale fixed-wing UAV swarms by progressively increasing the UAV population during training. Specifically, PASCAL consists of two core components: policy learning and knowledge transfer. For policy learning, an improved multi-agent deep deterministic policy gradient algorithm is proposed to accelerate the learning process. For knowledge transfer, an attention-based population-invariant network is designed to handle inputs of varying dimensionality, enabling previously learned models to be reloaded and fine-tuned as the swarm grows. Finally, both numerical and semi-physical simulations demonstrate the advantages of PASCAL in terms of learning efficiency and generalization capability across different swarm scales.
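
The key idea behind a population-invariant network can be illustrated with a minimal sketch: attention pooling maps a variable-size set of neighbor observations to a fixed-size embedding, so one policy network can be reloaded and fine-tuned as the curriculum increases the swarm population. This is not the paper's released code; the class name, dimensions, and PyTorch framing below are assumptions for illustration only.

```python
# Illustrative sketch (hypothetical, not the paper's implementation):
# a permutation-invariant attention encoder whose output size is
# independent of the number of neighboring UAVs.
import torch
import torch.nn as nn

class AttentionPoolEncoder(nn.Module):
    def __init__(self, obs_dim: int, embed_dim: int = 64):
        super().__init__()
        self.key = nn.Linear(obs_dim, embed_dim)
        self.value = nn.Linear(obs_dim, embed_dim)
        # Learned query vector; attention weights over neighbors sum to 1,
        # so the pooled output does not depend on the neighbor count.
        self.query = nn.Parameter(torch.randn(embed_dim))

    def forward(self, neighbors: torch.Tensor) -> torch.Tensor:
        # neighbors: (batch, n_neighbors, obs_dim); n_neighbors may vary.
        k = self.key(neighbors)                        # (B, N, E)
        v = self.value(neighbors)                      # (B, N, E)
        scores = k @ self.query / k.shape[-1] ** 0.5   # (B, N)
        weights = torch.softmax(scores, dim=-1)        # (B, N)
        return (weights.unsqueeze(-1) * v).sum(dim=1)  # (B, E)

# The same encoder handles each stage of a population curriculum:
enc = AttentionPoolEncoder(obs_dim=6)
for n in (4, 16, 64):
    out = enc(torch.randn(2, n, 6))
    print(n, out.shape)  # torch.Size([2, 64]) regardless of n
```

Because the embedding shape is fixed for any neighbor count, the downstream actor and critic weights learned with a small population remain dimensionally compatible when training resumes with a larger one.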
