In many real-world scenarios, tasks involve coordinating multiple agents, such as managing robot clusters, drone swarms, and autonomous vehicles. These tasks are commonly addressed with Multi-Agent Reinforcement Learning (MARL). However, existing MARL algorithms often lack prior knowledge of the number and types of agents involved and therefore require agents to generalize across varied task configurations, which can lead to suboptimal performance due to underestimated action values and the selection of less effective joint policies. To address these challenges, we propose GDT, a novel multi-agent deep reinforcement learning framework based on an adaptive grouping dynamic topological space. GDT uses a group mesh topology to interconnect the local action value functions of the agents, enabling effective coordination and knowledge sharing among them. By computing three different interpretations of the action value function, GDT overcomes monotonicity constraints and derives a more effective overall action value function. In addition, GDT groups agents with high similarity to facilitate parameter sharing, thereby enhancing knowledge transfer and generalization across scenarios. Furthermore, GDT introduces a strategy regularization method for exploration across multiple action spaces: each agent is assigned an independent entropy temperature during exploration, enabling agents to efficiently explore potential actions and better approximate the total state value. Experimental results show that GDT significantly outperforms state-of-the-art algorithms on Google Research Football (GRF) and the StarCraft Multi-Agent Challenge (SMAC). On SMAC in particular, GDT achieves a success rate of nearly 100% on almost all Hard and Super Hard map scenarios. We also validate the effectiveness of the algorithm on Non-monotonic Matrix Games.
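To make the per-agent entropy temperature idea concrete, the sketch below shows one way an entropy-regularized objective with an independent, learnable temperature for each agent could be written in PyTorch. This is a minimal illustration assuming a soft value-based objective; all tensor names, shapes, and the loss form are assumptions for exposition, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

n_agents, n_actions, batch = 3, 5, 8

# Hypothetical per-agent policy logits and local action-value estimates.
logits = torch.randn(batch, n_agents, n_actions)
q_values = torch.randn(batch, n_agents, n_actions)

# One learnable entropy temperature per agent (log-parameterized to stay positive).
log_alpha = torch.zeros(n_agents, requires_grad=True)

probs = F.softmax(logits, dim=-1)
log_probs = F.log_softmax(logits, dim=-1)
entropy = -(probs * log_probs).sum(dim=-1)        # (batch, n_agents)
expected_q = (probs * q_values).sum(dim=-1)       # (batch, n_agents)

alpha = log_alpha.exp()                           # (n_agents,), one temperature each
# Each agent trades off its expected action value against its own
# exploration bonus, weighted by its individual temperature.
per_agent_objective = expected_q + alpha * entropy
loss = -per_agent_objective.mean()
loss.backward()                                   # gradients flow to log_alpha as well
```

In this sketch, agents with a higher temperature place more weight on the entropy bonus and hence explore more broadly, which is one plausible reading of how per-agent temperatures support exploration of multiple action spaces.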