Abstract

Stochastic games provide a framework for interactions among multiple agents and enable a myriad of applications. In these games, agents decide on actions simultaneously; after the actions are taken, the state of every agent updates to the next state, and each agent receives a reward. However, finding an equilibrium (if one exists) in such games is often difficult when the number of agents becomes large. This paper focuses on finding a mean-field equilibrium (MFE) in an action-coupled stochastic game setting in an episodic framework. It is assumed that an agent can approximate the impact of the other agents' actions by the empirical distribution of the actions. All agents know this action distribution and employ lower-myopic best response dynamics to choose the optimal oblivious strategy. This paper proposes a posterior sampling-based approach for reinforcement learning in the mean-field game, where each agent samples a transition probability from the previously observed transitions. We show that the policy and action distributions converge to the optimal oblivious strategy and the limiting distribution, respectively, which together constitute an MFE.
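To make the learning loop concrete, here is a minimal Python sketch of a posterior-sampling procedure of this flavor: in each episode the agent samples a transition model from the posterior built on previously observed transitions, computes an oblivious strategy by dynamic programming against the current action distribution, plays the episode, and then updates both the posterior counts and the action distribution. The problem sizes, function names, reward form, and the damped distribution update are illustrative assumptions, not the paper's exact Algorithm 1.

import numpy as np

# Minimal sketch of a posterior-sampling loop for an episodic, action-coupled
# mean-field game. All names, sizes, and the reward form are assumptions.
S, A, H = 5, 3, 10                      # states, actions, episode horizon (assumed)
rng = np.random.default_rng(0)

counts = np.ones((S, A, S))             # Dirichlet counts over observed transitions

def sample_model(counts):
    # Sample a transition kernel P(s' | s, a) from the posterior implied by
    # previously observed transitions (the posterior-sampling step).
    return np.array([[rng.dirichlet(counts[s, a]) for a in range(A)]
                     for s in range(S)])

def reward(s, a, sigma):
    # Action-coupled reward: depends on the agent's own state/action and on
    # the population action distribution sigma (purely illustrative form).
    return -abs(a - sigma @ np.arange(A)) + 0.1 * s

def plan_oblivious(P, sigma):
    # Finite-horizon dynamic programming against a fixed action distribution,
    # yielding an oblivious strategy and its value function.
    V = np.zeros(S)
    policy = np.zeros((H, S), dtype=int)
    for h in reversed(range(H)):
        Q = np.array([[reward(s, a, sigma) + P[s, a] @ V for a in range(A)]
                      for s in range(S)])
        policy[h] = Q.argmax(axis=1)
        V = Q.max(axis=1)
    return policy, V

sigma = np.ones(A) / A                  # initial guess for the mean action distribution
state = 0
for episode in range(200):
    P_hat = sample_model(counts)        # sample a model for this episode
    policy, _ = plan_oblivious(P_hat, sigma)
    visits = np.zeros(A)
    for h in range(H):
        a = policy[h, state]
        visits[a] += 1
        next_state = rng.choice(S, p=P_hat[state, a])   # stand-in for the true environment
        counts[state, a, next_state] += 1               # update the posterior counts
        state = next_state
    # Damped update of the empirical action distribution; the paper's
    # lower-myopic best response dynamics is a specific, more careful rule.
    sigma = 0.9 * sigma + 0.1 * visits / H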

Highlights

  • We live in a world where multiple agents interact repeatedly in a common environment

  • Learning in multi-agent reinforcement learning (MARL) is fundamentally different from the traditional single-agent reinforcement learning (RL) problem, since agents interact both with the environment and with each other

  • The optimal oblivious strategy obtained from Algorithm 1 and the limiting action distribution constitute a mean-field equilibrium, and the value function obtained from the algorithm converges to the optimal value function under the true distribution (a rough formalization of the equilibrium property follows this list)
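As a rough formalization of the equilibrium property referenced above, with assumed notation (an oblivious strategy \pi, a population action distribution \sigma, and V^{\pi}(s,\sigma) the value of playing \pi from state s when the population acts according to \sigma), a pair (\pi^{*}, \sigma^{*}) is a mean-field equilibrium when it satisfies a best-response condition and a consistency condition:

\pi^{*} \in \arg\max_{\pi} V^{\pi}(s,\sigma^{*}) \quad \text{for all } s, \qquad \sigma^{*}(a) = \sum_{s} \rho_{\pi^{*},\sigma^{*}}(s)\, \Pr\!\left[\pi^{*}(s) = a\right],

where \rho_{\pi^{*},\sigma^{*}} denotes the limiting state distribution induced when every agent plays \pi^{*} against \sigma^{*}. The paper's exact definition may differ in its details.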


Summary

Motivation

We live in a world where multiple agents interact repeatedly in a common environment. Analyzing such interactions directly becomes intractable as the number of agents grows. The mean-field game framework drastically reduces this complexity, since an agent only needs to consider the empirical distribution of the actions played by the other agents. Such mean-field games arise in several domains. For example, in investment settings with a large number of agents, the average investment made per agent impacts each individual agent's decision, so the game can be modeled as a mean-field game. Another example of a mean-field game is the demand response price in the smart grid [8,9].

Contribution
Related Literature
Multi-Player Stochastic Game
Mean-Field Game
Value Function, Q Function and Policy
Stationary Mean-Field Equilibrium
Proposed Algorithm
Convergence Result
Conditions for a Strategy to Be a MFE
Conditions of Lemma 1 Are Met for Any Optimal Oblivious Strategy
Sampling Does Not Lead to a Gap for Expected Value Function
Conclusions
