Customization of J. Bather UCB strategy for a Gaussian multi-armed bandit

Alexander Valerianovich Kolnogorov, Sergey Vladislavovich Garbar

doi:10.17076/mgta_2022_2_48

Open Access

https://doi.org/10.17076/mgta_2022_2_48

Abstract

We consider the customization of the UCB strategy, first proposed by J. Bather for the Bernoulli two-armed bandit, to the case of a Gaussian multi-armed bandit that describes batch data processing. This optimal control problem has a classical interpretation as a game with nature, in which the player's payment function is the expected loss of total income caused by incomplete information. The goal is stated in a minimax setting. For this game with nature, we present an invariant description of the control with a horizon equal to one, which allows computations to be performed in two ways: by Monte-Carlo simulation and analytically by dynamic programming. For various configurations of the game, we find saddle points, which characterize the optimal control and the worst-case distribution of the parameters of the multi-armed bandit.
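The Monte-Carlo route mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the index form (sample mean plus `a * sqrt(variance * ln(n) / n_l)`), the tuning parameter `a`, and the function names `simulate_batch_ucb` and `expected_regret` are all assumptions made for illustration, using a generic UCB rule rather than the exact customization of Bather's strategy studied in the paper. Each decision assigns a whole batch to one arm, and the loss is the expected income forgone relative to always playing the best arm.

```python
import math
import random


def simulate_batch_ucb(means, horizon, batch_size, a=1.0, variance=1.0, rng=None):
    """Run one trajectory of a generic batch UCB rule and return its loss.

    Hypothetical illustration: the index below is a standard UCB form,
    not necessarily the one used in the paper.
    """
    rng = rng or random.Random()
    k = len(means)
    pulls = [0] * k        # observations collected per arm
    totals = [0.0] * k     # cumulative income per arm
    best = max(means)
    loss = 0.0

    for b in range(horizon // batch_size):
        if b < k:
            arm = b  # initial stage: one batch for every arm
        else:
            n = b * batch_size  # observations processed so far

            def index(l):
                bonus = a * math.sqrt(variance * math.log(n) / pulls[l])
                return totals[l] / pulls[l] + bonus

            arm = max(range(k), key=index)

        # The sum of batch_size i.i.d. N(mean, variance) incomes is
        # N(batch_size * mean, batch_size * variance).
        totals[arm] += rng.gauss(batch_size * means[arm],
                                 math.sqrt(batch_size * variance))
        pulls[arm] += batch_size
        loss += batch_size * (best - means[arm])

    return loss


def expected_regret(means, horizon, batch_size, a=1.0, runs=200, seed=0):
    """Monte-Carlo estimate of the expected loss, averaged over `runs` trajectories."""
    rng = random.Random(seed)
    total = sum(simulate_batch_ucb(means, horizon, batch_size, a, rng=rng)
                for _ in range(runs))
    return total / runs
```

In a minimax study one would evaluate `expected_regret` over a grid of arm means and values of `a`, then look for the parameter that minimizes the worst-case loss; the grid and the search procedure are left out here because the abstract does not specify them.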

Journal: Mathematical Game Theory and Applications
Publication Date: Jan 18, 2023
Citations: 1
License type: CC BY 4.0
