Many real-world multi-agent systems require agents to cooperate with one another. However, it is challenging to generate optimal cooperative strategies (e.g., location or speed coordination in an encirclement task) in partially observable environments. Information sharing and temporal experience are the two main ways to alleviate partial observability. Existing information-sharing approaches mainly rely on observation-sharing schemes that under-utilize topological information; agents therefore obtain incomplete environmental information, which weakens the information-sharing ability between agents. Our idea is to combine observation information and topology information to form a more comprehensive information-sharing scheme. Inspired by this, we propose a novel information-sharing model for multi-agent reinforcement learning (MARL), named the Entity-Teammate Hierarchical Fusion (ET-HF) model. It comprises three modules: (1) message generation, which uses graph neural networks (GNNs) to combine local observations with topological information and generate two types of messages; (2) message interaction, which transmits the messages and adopts a hierarchical attention fusion mechanism to integrate the two message types and complete information sharing; (3) cooperative policy optimization, which feeds the integrated messages into the proximal policy optimization (PPO) algorithm to generate a cooperative strategy adapted to partial observability. Through ET-HF, agents improve their understanding of the environment and reduce the information loss caused by partial observability, thereby generating a superior collaborative strategy. We train the multi-agent teams in a fully decentralized framework, and empirical results show that our model outperforms the baselines on cooperative tasks such as coverage control, formation control, and line control.
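The abstract does not specify implementation details, but the hierarchical fusion idea can be illustrated with a minimal PyTorch sketch: an entity-level attention stage summarizes what an agent observes, and a teammate-level stage integrates messages shared by teammates. All module names, dimensions, and the exact attention scheme below are assumptions for illustration; the paper's actual ET-HF architecture (including its GNN message encoders) may differ.

```python
# Minimal, illustrative sketch of two-level (entity -> teammate) attention fusion.
# Names, dimensions, and structure are assumptions, not the paper's implementation.
import torch
import torch.nn as nn


class HierarchicalAttentionFusion(nn.Module):
    """Fuse entity-level and teammate-level messages with two attention stages."""

    def __init__(self, obs_dim: int, msg_dim: int, n_heads: int = 4):
        super().__init__()
        # Stand-ins for the paper's GNN-based message generation:
        # one encoder for observed entities, one for teammate messages.
        self.entity_enc = nn.Linear(obs_dim, msg_dim)
        self.teammate_enc = nn.Linear(obs_dim, msg_dim)
        # Level 1: attend over the entities this agent observes.
        self.entity_attn = nn.MultiheadAttention(msg_dim, n_heads, batch_first=True)
        # Level 2: attend over messages shared by teammates.
        self.teammate_attn = nn.MultiheadAttention(msg_dim, n_heads, batch_first=True)
        self.fuse = nn.Linear(2 * msg_dim, msg_dim)

    def forward(self, self_obs, entity_obs, teammate_msgs):
        # self_obs:      (B, obs_dim)              agent's own observation
        # entity_obs:    (B, n_entities, obs_dim)  observed entity features
        # teammate_msgs: (B, n_teammates, obs_dim) messages from teammates
        q = self.entity_enc(self_obs).unsqueeze(1)          # (B, 1, msg_dim)
        e = self.entity_enc(entity_obs)                     # (B, n_entities, msg_dim)
        t = self.teammate_enc(teammate_msgs)                # (B, n_teammates, msg_dim)
        # Level 1: entity-level message for this agent.
        entity_msg, _ = self.entity_attn(q, e, e)           # (B, 1, msg_dim)
        # Level 2: teammate-level message, queried by the entity-level summary.
        team_msg, _ = self.teammate_attn(entity_msg, t, t)  # (B, 1, msg_dim)
        # Integrate the two message types into one representation for the policy.
        fused = self.fuse(torch.cat([entity_msg, team_msg], dim=-1))
        return fused.squeeze(1)                             # (B, msg_dim)


if __name__ == "__main__":
    torch.manual_seed(0)
    model = HierarchicalAttentionFusion(obs_dim=16, msg_dim=32)
    fused = model(
        torch.randn(8, 16),     # batch of 8 agents' own observations
        torch.randn(8, 5, 16),  # 5 observed entities per agent
        torch.randn(8, 3, 16),  # 3 teammate messages per agent
    )
    print(fused.shape)          # torch.Size([8, 32])
```

In the full model as described, the fused representation would then serve as the input to a PPO actor-critic to produce the cooperative policy.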