Abstract

Multi-agent reinforcement learning is one of the newest and most actively developing areas of machine learning. Among multi-agent reinforcement learning methods, one of the most promising is MADDPG, whose advantage is the high convergence of the learning process. Its disadvantage is that the number of agents N at the training stage must equal the number of agents K at the functioning stage. At the same time, target multi-agent systems (MAS), such as groups of UAVs or mobile ground robots, have a variable number of agents, which rules out the direct use of MADDPG. To solve this problem, the article proposes an improved MADDPG method for multi-agent reinforcement learning in systems with a variable number of agents. The improved method is based on the hypothesis that, to perform its functions, an agent needs information about the states of only a few nearest neighbors rather than of all other MAS agents. Based on this hypothesis, a hybrid joint/independent learning method for MAS with a variable number of agents is proposed, in which a small number of agents N is trained to support the functioning of an arbitrary number of agents K, K > N. Experiments have shown that the improved MADDPG method provides MAS functioning efficiency comparable to the original method while the number of agents K at the functioning stage varies within wide limits.
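The nearest-neighbor hypothesis can be illustrated with a minimal sketch: each agent's input is built from its own state plus the states of a fixed number of nearest neighbors, so the input dimension depends on the neighbor count rather than on the total number of agents. The function and parameter names below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def build_local_observation(agent_idx, positions, states, n_neighbors):
    """Return the agent's own state concatenated with the states of its
    n_neighbors nearest neighbors (by Euclidean distance).

    positions: (K, d) array of agent coordinates
    states:    (K, s) array of per-agent state vectors
    """
    deltas = positions - positions[agent_idx]
    dists = np.linalg.norm(deltas, axis=1)
    dists[agent_idx] = np.inf                      # exclude the agent itself
    nearest = np.argsort(dists)[:n_neighbors]      # indices of closest agents
    return np.concatenate([states[agent_idx]] + [states[j] for j in nearest])

# Because the input size depends only on n_neighbors (not on the total
# number of agents K), policies and critics trained with N agents can,
# under this hypothesis, operate in a system with K > N agents.
```

In this reading, the fixed-size critic input is what decouples the training population size N from the runtime population size K.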
