Online Learning over a Finite Action Set with Limited Switching

Jason M Altschuler, Kunal Talwar

doi:10.1287/moor.2020.1052

Abstract

This paper studies the value of switching actions in the Prediction From Experts problem (PFE) and Adversarial Multiarmed Bandits problem (MAB). First, we revisit the well-studied and practically motivated setting of PFE with switching costs. Many algorithms achieve the minimax optimal order for both regret and switches in expectation; however, high probability guarantees are an open problem. We present the first algorithms that achieve this optimal order for both quantities with high probability. This also implies the first high probability guarantees for several other problems, and, in particular, is efficiently adaptable to online combinatorial optimization with limited switching. Next, to investigate the value of switching actions more granularly, we introduce the switching budget setting, which limits algorithms to a fixed number of (costless) switches. Using this result and several reductions, we unify previous work and completely characterize the complexity of this switching budget setting up to small polylogarithmic factors: for both PFE and MAB, for all switching budgets, and for both expectation and high probability guarantees. Interestingly, as the switching budget decreases, the minimax regret rate admits a phase transition for PFE but not for MAB. These results recover and generalize the known minimax rates for the (arbitrary) switching cost setting.
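To make the switching budget setting concrete, the following is a minimal sketch of a standard mini-batching idea: split the horizon into at most `budget + 1` blocks and resample an expert only at block boundaries, so the number of switches can never exceed the budget. This is not the algorithm from the paper. The function name `blocked_hedge`, the full-information loss matrix, and the learning-rate tuning are all assumptions made for illustration.

```python
import math
import random


def blocked_hedge(losses, budget, seed=0):
    """Illustrative blocked exponential-weights learner (hypothetical helper).

    losses: list of T rounds, each a list of n expert losses in [0, 1].
    budget: maximum number of action switches allowed.
    Returns (actions, switches, total_loss).
    """
    rng = random.Random(seed)
    T, n = len(losses), len(losses[0])

    # At most budget + 1 blocks means at most `budget` switches.
    num_blocks = min(T, budget + 1)
    block_len = math.ceil(T / num_blocks)

    # Assumed tuning: textbook Hedge rate over num_blocks "super-rounds",
    # rescaled because each block's loss lies in [0, block_len].
    eta = math.sqrt(8 * math.log(n) / num_blocks) / block_len

    cumulative = [0.0] * n
    actions = []
    current = None
    for t in range(T):
        if t % block_len == 0:
            # Resample only at block boundaries; subtract the minimum
            # cumulative loss for numerical stability.
            base = min(cumulative)
            weights = [math.exp(-eta * (c - base)) for c in cumulative]
            current = rng.choices(range(n), weights=weights)[0]
        actions.append(current)
        # Full-information (PFE) feedback: all expert losses are observed.
        for i in range(n):
            cumulative[i] += losses[t][i]

    switches = sum(a != b for a, b in zip(actions, actions[1:]))
    total_loss = sum(losses[t][a] for t, a in enumerate(actions))
    return actions, switches, total_loss


if __name__ == "__main__":
    gen = random.Random(1)
    demo_losses = [[gen.random() for _ in range(4)] for _ in range(100)]
    _, used, loss = blocked_hedge(demo_losses, budget=5)
    print("switches used:", used, "total loss:", round(loss, 2))
```

With `budget=0` the learner commits to a single expert for all rounds. Larger budgets allow more frequent resampling, which is the trade-off between switches and regret that the abstract describes.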

Journal: Mathematics of Operations Research
Publication Date: Sep 21, 2020
Citations: 1

Similar Papers

Bandits with switching costs
Ofer Dekel ... Jian Ding
31 May 2014

Optimal learning and experimentation in bandit problems
Monica Brezzi ... Tze Leung Lai
Journal of Economic Dynamics and Control | VOL. 27
12 Aug 2002

Approximation algorithms for restless bandit problems
Sudipto Guha ... Kamesh Munagala
Journal of the ACM | VOL. 58
01 Dec 2010

Opportunistic channel access with repetition time diversity and switching cost: a block multi-armed bandit approach
Zhiqiang Qin ... Yuhua Xu
Wireless Networks | VOL. 24
21 Dec 2016
