Mean Field Equilibrium in Multi-Armed Bandit Game with Continuous Reward

Xiong Wang, Riheng Jia

doi:10.24963/ijcai.2021/429

Abstract

Mean field games facilitate the analysis of multi-armed bandits (MAB) with a large number of agents by approximating the agents' interactions with an average effect. Existing mean field models for multi-agent MAB mostly assume a binary reward function, which leads to tractable analysis but is usually not applicable in practical scenarios. In this paper, we study the mean field bandit game with a continuous reward function. Specifically, we focus on deriving the existence and uniqueness of the mean field equilibrium (MFE), thereby guaranteeing the asymptotic stability of the multi-agent system. To accommodate the continuous reward function, we encode the learned reward into an agent state, which is in turn mapped to the agent's stochastic arm-playing policy and updated using realized observations. We show that the state evolution is upper semi-continuous, based on which the existence of the MFE is obtained. As Markov analysis mainly applies to the discrete-state case, we transform the stochastic continuous state evolution into a deterministic ordinary differential equation (ODE). On this basis, we characterize a contraction mapping for the ODE to ensure a unique MFE for the bandit game. Extensive evaluations validate our MFE characterization and exhibit tight empirical regret for the MAB problem.
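The state, policy, and update loop described in the abstract can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not the authors' model: the softmax policy, the congestion-style reward in `mean_reward`, the uniform noise, the step size, and every function and parameter name below are hypothetical choices made for illustration. Each agent keeps a per-arm reward estimate as its state, samples an arm from a policy derived from that state, observes a continuous reward that depends on the population's arm distribution, and nudges its state toward the observation.

```python
import math
import random


def softmax(state, temperature=0.5):
    """Map a state (per-arm reward estimates) to a stochastic arm-playing policy.

    Softmax is an assumption of this sketch, not the paper's mapping.
    """
    peak = max(state)
    weights = [math.exp((s - peak) / temperature) for s in state]
    total = sum(weights)
    return [w / total for w in weights]


def mean_reward(arm, population, base):
    """Hypothetical continuous reward in [0, 1]: an arm pays less when crowded."""
    return base[arm] * (1.0 - 0.5 * population[arm])


def simulate(num_agents=500, rounds=300, step=0.1, base=(0.9, 0.6, 0.4), seed=0):
    """Run the toy population and return the final fraction of agents per arm."""
    rng = random.Random(seed)
    num_arms = len(base)
    arms = range(num_arms)
    # Every agent starts with the same neutral estimate for every arm.
    states = [[0.5] * num_arms for _ in range(num_agents)]
    population = [1.0 / num_arms] * num_arms
    for _ in range(rounds):
        # Each agent samples one arm from the policy induced by its state.
        choices = [rng.choices(arms, weights=softmax(s))[0] for s in states]
        counts = [0] * num_arms
        for arm in choices:
            counts[arm] += 1
        # The population profile plays the role of the mean field.
        population = [c / num_agents for c in counts]
        for state, arm in zip(states, choices):
            noise = rng.uniform(-0.05, 0.05)
            reward = min(1.0, max(0.0, mean_reward(arm, population, base) + noise))
            # Move the estimate of the played arm toward the realized reward.
            state[arm] += step * (reward - state[arm])
    return population


if __name__ == "__main__":
    print(simulate())
```

If the population profile returned by `simulate` stops changing appreciably as `rounds` grows, that is loosely analogous to the system settling near an equilibrium, although this toy gives no guarantee of existence or uniqueness.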

Similar Papers

Mean field equilibria of multi armed bandit games
Ramki Gummadi ... Jia Yuan Yu
01 Oct 2012

Mean Field Equilibria of Multiarmed Bandit Games
Ramakrishna Gummadi ... Jia Yuan Yu
SSRN Electronic Journal
24 May 2012

Model-free Computation Method in First-order Linear Quadratic Mean Field Games
Zhenhui Xu ... Tielong Shen
IFAC PapersOnLine | VOL. 56
01 Jan 2023

Mean field equilibria of multiarmed bandit games
Ramakrishna Gummadi ... Ramesh Johari
04 Jun 2012
