Abstract

Multiple real-world problems are naturally modeled as cooperative multi-agent systems, ranging from satellite formation to traffic monitoring. These systems require algorithms that can learn successful policies with independent agents that rely solely on local partial-observations of the environment. However, multi-agent environments are more complex, due to their partial-observability and non-stationarity from an agent’s perspective, as well as the structural credit assignment problem and the curse of dimensionality, and achieving coordination in such systems remains a complex challenge. To this end, we propose a multi-agent actor-critic algorithm called Asynchronous Advantage Actor Centralized-Critic with Communication (A3C3). A3C3 uses a centralized critic to estimate a value function, decentralized actors to approximate each agent’s policy function, and decentralized communication networks for each agent to share relevant information with its team. The critic can incorporate additional information, like the environment’s global state, when available, and optimizes the actor networks. The actor networks of an agent’s teammates optimize that agent’s communication network, such that each agent learns to output information that is relevant to the policies of others. A3C3 supports a dynamic amount of agents, noisy communication mediums, and can be horizontally scaled to shorten its learning phase. We evaluate A3C3 in two partially-observable multi-agent suites where agents benefit from communicating local information to each other. A3C3 outperforms state-of-the-art multi-agent algorithms, independent approaches, and centralized controllers with access to all agents’ observations.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.