Abstract

In this paper, we consider a multi-armed bandit in which each arm is a Markov process evolving on a finite state space. The state space is common across the arms, and the arms are independent of each other. The transition probability matrix of one of the arms (the odd arm) is different from the common transition probability matrix of all the other arms. A decision maker, who knows these transition probability matrices, wishes to identify the odd arm as quickly as possible, while keeping the probability of decision error small. To do so, the decision maker collects observations from the arms by pulling the arms in a sequential manner, one at each discrete time instant. However, the decision maker has a trembling hand, and the arm that is actually pulled at any given time differs, with a small probability, from the one he intended to pull. The observation at any given time is the arm that is actually pulled and its current state. The Markov processes of the unobserved arms continue to evolve. This makes the arms restless. For the above setting, we derive the first known asymptotic lower bound on the expected time required to identify the odd arm, where the asymptotics is that of vanishing error probability. The continued evolution of each arm adds a new dimension to the problem, leading to a family of Markov decision problems (MDPs) on a countable state space. We then stitch together certain parameterised solutions to these MDPs and obtain a sequence of strategies whose expected times to identify the odd arm come arbitrarily close to the lower bound in the regime of vanishing error probability. Prior works dealt with independent and identically distributed (across time) arms and with rested Markov arms, whereas our work deals with restless Markov arms.
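To make the observation model concrete, here is a minimal simulation sketch of one interaction round: every arm takes a Markov step whether or not it is observed (restlessness), the trembling hand may substitute another arm for the intended one, and the observation is the actually pulled arm together with its current state. This is only an illustration of the setting, not the paper's identification strategy; the arm count K, state-space size S, matrices P and Q, the odd-arm index, the value of delta, and the uniform-substitution form of the trembling hand are all assumed here for demonstration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

K = 5        # number of arms (illustrative)
S = 2        # size of the common finite state space (illustrative)
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])   # common TPM of the non-odd arms (hypothetical)
Q = np.array([[0.5, 0.5],
              [0.5, 0.5]])   # TPM of the odd arm (hypothetical)
odd_arm = 2  # index of the odd arm, unknown to the decision maker
delta = 0.1  # trembling-hand probability (hypothetical value)

states = rng.integers(0, S, size=K)  # current state of each arm

def pull(intended):
    """One round: every arm evolves (the arms are restless), the
    trembling hand possibly substitutes another arm, and the
    observation is the actually pulled arm and its current state."""
    # All arms take one Markov transition, observed or not.
    for a in range(K):
        tpm = Q if a == odd_arm else P
        states[a] = rng.choice(S, p=tpm[states[a]])
    # Assumed trembling-hand model: with probability delta a
    # uniformly random arm is pulled instead of the intended one.
    actual = intended if rng.random() > delta else int(rng.integers(K))
    return actual, int(states[actual])

for t in range(5):
    arm, state = pull(intended=0)
    print(f"t={t}: pulled arm {arm}, observed state {state}")
```

Note how the decision maker never sees the states of the unpulled arms, even though they keep evolving; it is exactly this hidden evolution that leads to the countable-state MDP formulation described above.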
