Dynamics of Multiagent Reinforcement Learning Compared to Synchronisation Dynamics of Kuramoto Oscillators

Maksim V Kondakov, Valentina Y Guleva

doi:10.1016/j.procs.2022.10.202

Abstract

Multiagent reinforcement learning (MARL) is a widespread approach to optimisation problems on systems that allow feedback links. Multiple intelligent agents interact with each other to improve the search for optimal behavioural patterns. The structure of interactions has been shown to affect system dynamics, in particular the learning dynamics of intelligent agents, resulting in emergent behaviour, which makes the exploration of optimal topologies for different task formulations a topical problem. To solve certain problems effectively, the ability of agents to cooperate within an environment is extremely important. The main idea of coordinated reinforcement learning is that agents, when performing actions, take into account the actions of their neighbours, and this information is also used when updating their parameters. However, because multiagent environments are highly dynamic, a large amount of computation is required to train a MARL system effectively. We believe that developing a graph formation method for multiagent reinforcement learning systems can decrease resource consumption. To contribute to this problem, we explore the effects of complete, star, path, and ring graphs on learning dynamics in a predator-prey environment and compare them with synchronisation dynamics on graphs. In contrast to the synchronisation case, which shows a dependence on average shortest paths, the predator-prey system has two types of agents, which results in different patterns of influence.
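
As an illustration of the topological quantity mentioned in the abstract, the following minimal Python sketch builds the four interaction graphs (complete, star, path, ring) and computes the average shortest path length of each by breadth-first search. The function names `make_graph` and `average_shortest_path` and the choice of six nodes are assumptions made for this example only; they are not taken from the paper, and the sketch does not reproduce the authors' experiments.

```python
from collections import deque


def make_graph(kind, n):
    """Return an undirected graph on nodes 0..n-1 as an adjacency dict.

    `kind` is one of "complete", "star", "path", "ring" (hypothetical
    helper for illustration; node 0 is the hub of the star).
    """
    adj = {i: set() for i in range(n)}

    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)

    if kind == "complete":
        for i in range(n):
            for j in range(i + 1, n):
                link(i, j)
    elif kind == "star":
        for i in range(1, n):
            link(0, i)
    elif kind == "path":
        for i in range(n - 1):
            link(i, i + 1)
    elif kind == "ring":
        for i in range(n):
            link(i, (i + 1) % n)
    else:
        raise ValueError(f"unknown graph kind: {kind}")
    return adj


def average_shortest_path(adj):
    """Mean shortest-path distance over all ordered pairs of distinct nodes.

    Assumes the graph is connected, which holds for all four topologies.
    """
    n = len(adj)
    total = 0
    for source in adj:
        # Breadth-first search gives hop distances in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total / (n * (n - 1))


if __name__ == "__main__":
    n_agents = 6  # assumed size, chosen only for illustration
    for kind in ("complete", "star", "path", "ring"):
        value = average_shortest_path(make_graph(kind, n_agents))
        print(f"{kind:8s} {value:.3f}")
```

With six nodes this gives 1.0 for the complete graph, about 1.67 for the star, 1.8 for the ring, and about 2.33 for the path, so the four topologies are clearly separated by this measure.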

Full Text