A General Theory of MultiArmed Bandit Processes with Constrained Arm Switches

Wenqing Bao, Xianyi Wu, Xiaoqiang Cai

doi:10.1137/19m1282386

Abstract

This paper proposes a general framework for multi-armed bandit (MAB) processes by introducing a type of restriction on switches among arms that evolve in continuous time. The Gittins index process is constructed for any single arm subject to the restrictions on switches, and the optimality of the corresponding Gittins index rule is then established. The Gittins indices defined in this paper are consistent with those for MAB processes in continuous time, in integer time, in the semi-Markovian setting, and in the general discrete-time setting, so the new theory covers the classical models as special cases and also applies to many other situations that have not yet been addressed in the literature. While the proof of the optimality of Gittins index policies benefits from ideas in the existing theory of MAB processes in continuous time, new techniques are introduced that drastically simplify the proof.

Journal: SIAM Journal on Control and Optimization
Publication Date: Jan 1, 2021
Citations: 2

Similar Papers

Multi-Armed Bandit Processes
Xiaoqiang Cai ... Xian Zhou
01 Jan 2014

Empirical Gittins index strategies with ε-explorations for multi-armed bandit problems
Xiao Li ... Xianyi Wu
Computational Statistics & Data Analysis | VOL. 180
13 Sep 2022

On the optimality of the Gittins index rule in multi-armed bandits with multiple plays
D. G. Pandelis ... D. Teneketzis
13 Dec 1995

Markov processes in continuous time and space
Eric Renshaw
24 Feb 2011
