A Generalization of Robust Normal Two-Armed Bandit

Alexander V Kolnogorov

doi:10.1016/j.ifacol.2016.07.959

Open Access

Journal: IFAC PapersOnLine

Publication Date: Jan 1, 2016

Affiliation: Yaroslav-the-Wise Novgorod State University

Keywords: Minimax Risk, Bayesian Risk

Abstract

We consider the Normal two-armed bandit problem with a priori known variances and unknown mathematical expectations of incomes in the robust (minimax) setting. This setup arises naturally in group control of data processing. We show that the problem can be solved using the main theorem of game theory, i.e., the minimax strategy and minimax risk can be determined as the Bayesian strategy and Bayesian risk corresponding to the worst-case prior distribution. We obtain a recursive invariant Bellman-type equation for calculating the corresponding Bayesian risk and Bayesian strategy. The requirement of a priori known variances of incomes may be dropped, because the variances can be estimated at the initial stage of control.
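
For orientation, the relation between minimax and Bayesian risk described in the abstract can be written out as below. The notation is an assumption introduced here for illustration and is not taken from the paper: $N$ is the control horizon, $\theta = (m_1, m_2)$ is the pair of unknown expectations in a parameter set $\Theta$, $\sigma$ is a strategy, $\xi_n$ is the income at step $n$, and $\lambda$ is a prior distribution on $\Theta$.

```latex
% Notation assumed for illustration; the paper's own symbols may differ.
\begin{align*}
L_N(\sigma,\theta) &= N \max(m_1, m_2) - \mathbf{E}_{\sigma,\theta}\Bigl(\sum_{n=1}^{N} \xi_n\Bigr)
  && \text{(expected loss, i.e. regret)} \\
R_N^{M}(\Theta) &= \inf_{\sigma}\, \sup_{\theta\in\Theta} L_N(\sigma,\theta)
  && \text{(minimax risk)} \\
R_N^{B}(\lambda) &= \inf_{\sigma} \int_{\Theta} L_N(\sigma,\theta)\,\lambda(d\theta)
  && \text{(Bayesian risk)} \\
R_N^{M}(\Theta) &= \sup_{\lambda} R_N^{B}(\lambda)
  && \text{(minimax risk as worst-case Bayesian risk)}
\end{align*}
```

Read this way, the minimax strategy is the Bayesian strategy for the prior that attains the supremum in the last line, which is the sense in which the abstract speaks of the worst-case prior distribution.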

Full Text