Abstract

Multi-agent systems (MAS) have long been a central topic in the distributed computing community. With the development of reinforcement learning, Multi-Agent Reinforcement Learning (MARL) has attracted growing attention: it aims to solve complex real-time tasks in dynamic multi-agent environments through agent interaction, and it has been widely applied in robotics, human-computer matches, autonomous driving, and other domains. Unlike single-agent reinforcement learning, MARL faces challenges arising from the complex relationships among agents, the most influential of which is credit assignment. Credit assignment substantially impedes reward distribution because the environment provides only a global reward, whereas each individual agent's own contribution is needed during training; how to estimate and infer the reward for each agent is therefore a key issue in MARL. In this paper, we present an overview of the main approaches to credit assignment in MARL, grouped by strategy into three categories: value-based, policy-based, and mixing network-based algorithms. We also compare the performance of these algorithms in different multi-agent experimental environments and evaluate the approaches by analyzing the experimental results. Finally, we summarize the main challenges in multi-agent credit assignment (MACA) together with their existing solutions, discuss the current shortcomings of the algorithms with respect to these challenges, and outline possible directions for future MACA research.
