Graphon mean-field control for cooperative multi-agent reinforcement learning

Yuanquan Hu, Xiaoli Wei, Junji Yan, Hengxi Zhang

doi:10.1016/j.jfranklin.2023.09.002

Abstract

The marriage between mean-field theory and reinforcement learning has shown great capacity for solving large-scale control problems with homogeneous agents. To break the homogeneity restriction of mean-field theory, recent work has introduced graphon theory into the mean-field paradigm. In this paper, we propose a graphon mean-field control (GMFC) framework to approximate cooperative heterogeneous multi-agent reinforcement learning (MARL) with nonuniform interactions and with reward functions and state transition functions that differ among agents, and we show that the approximation error is of order $\mathcal{O}(1/\sqrt{N})$, where $N$ is the number of agents. By discretizing the graphon index of GMFC, we further introduce a smaller class of GMFC called block GMFC, which is shown to approximate cooperative MARL well in terms of both the value function and the policy. Finally, we design a Proximal Policy Optimization based algorithm for block GMFC that converges to the optimal policy of cooperative MARL. Our empirical studies on several examples demonstrate that our GMFC approach is comparable with state-of-the-art MARL algorithms while enjoying better scalability.

Full Text