Abstract

Recently, many works have been devoted to studying how agents can learn to cooperate efficiently in multiagent systems. However, this remains challenging in large-scale multiagent systems (MASs) because of the complex dynamics between agents and the environment and the explosion of the state-action space dimension. In this paper, we propose a novel MultiAgent Automatic Curriculum Learning method (MA-ACL) that addresses learning in large-scale cooperative MASs by first training on a scenario with a few agents and then automatically and progressively increasing the number of agents. An evaluation mechanism based on self-supervised learning is designed to automatically generate appropriate curricula with a progressively increasing number of agents. Moreover, since the observation dimension of the agents varies across curricula and the learned policy knowledge must be encoded effectively, we design a new Distributed Transferable Relation-modeling Policy network (DTRP) that handles dynamically sized network inputs and models relational knowledge between agents and their surrounding environment. Simulation results show that MA-ACL with DTRP significantly improves the performance of large-scale multiagent learning compared with manual-curriculum and non-curriculum methods, and that DTRP greatly boosts the performance of MA-ACL.
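The overall curriculum structure described above can be sketched as a simple loop: train at a small agent count, and advance to a larger scenario only when an evaluation gate is satisfied. The sketch below is a minimal illustration under stated assumptions; the `evaluate` gate, the doubling growth schedule, and the threshold are hypothetical stand-ins, not the paper's actual self-supervised evaluation mechanism or schedule.

```python
def run_curriculum(start_agents=2, max_agents=16, grow=lambda n: 2 * n,
                   evaluate=lambda n: 1.0, threshold=0.9, max_rounds=100):
    """Return the sequence of agent counts the curriculum trains on.

    evaluate(n) stands in for the paper's self-supervised evaluation of
    the policy at scale n; grow(n) stands in for the schedule that
    produces the next, larger curriculum stage.
    """
    visited = []
    n = start_agents
    for _ in range(max_rounds):
        if n > max_agents:
            break
        visited.append(n)          # train the policy at this scale (omitted)
        if evaluate(n) >= threshold:
            n = grow(n)            # gate passed: move to a larger scenario
        # otherwise, keep training at the current scale next round
    return visited
```

With the permissive placeholder evaluator, the curriculum simply doubles the agent count each stage, e.g. `run_curriculum()` visits 2, 4, 8, 16 agents in order.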
