Abstract
The multi-armed bandit (MAB) problem is a widely studied problem in the machine learning literature in the context of online learning. In this article, our focus is on a specific class of these problems, namely stochastic MAB problems, in which the reward of each arm is drawn from an unknown stochastic distribution. In particular, we emphasize stochastic MAB problems with strategic agents. Dealing with strategic agents warrants the use of mechanism design principles in conjunction with online learning, and leads to non-trivial technical challenges. In this paper, we first provide three motivating problems arising from Internet advertising, crowdsourcing, and smart grids. Next, we provide an overview of stochastic MAB problems and key associated learning algorithms, including upper confidence bound (UCB) based algorithms. We provide proofs of important results related to the regret analysis of these learning algorithms. Following this, we present mechanism design for stochastic MAB problems. With the classic example of sponsored search auctions as a backdrop, we bring out key insights into important issues such as regret lower bounds, exploration-separated mechanisms, designing truthful mechanisms, UCB based mechanisms, and extensions to multiple-pull MAB problems. Finally, we provide a bird's-eye view of recent results in the area and present a few issues that require immediate future attention.
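To make the UCB family of algorithms mentioned above concrete, the following is a minimal sketch of the classic UCB1 policy for a stochastic MAB instance. The function name `ucb1` and the `pull` callback are illustrative choices, not part of the article; the index formula (empirical mean plus an exploration bonus of sqrt(2 ln t / n_i)) is the standard UCB1 rule.

```python
import math
import random

def ucb1(pull, n_arms, horizon):
    """Run the UCB1 policy for `horizon` rounds.

    `pull(arm)` is a callback returning a stochastic reward in [0, 1].
    Returns per-arm pull counts and empirical mean rewards.
    """
    counts = [0] * n_arms
    means = [0.0] * n_arms

    # Initialisation: play each arm once to seed the estimates.
    for arm in range(n_arms):
        counts[arm] = 1
        means[arm] = pull(arm)

    for t in range(n_arms, horizon):
        # Pick the arm maximising the UCB index:
        # empirical mean + sqrt(2 ln t / n_i).
        arm = max(
            range(n_arms),
            key=lambda i: means[i] + math.sqrt(2 * math.log(t + 1) / counts[i]),
        )
        reward = pull(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean

    return counts, means

# Example: two Bernoulli arms; UCB1 concentrates pulls on the better arm.
random.seed(0)
probs = [0.2, 0.8]
counts, means = ucb1(
    lambda a: 1.0 if random.random() < probs[a] else 0.0, n_arms=2, horizon=2000
)
```

Because the exploration bonus shrinks as an arm is pulled more often, suboptimal arms are sampled only logarithmically often, which is the source of the logarithmic regret bounds discussed in the regret analysis.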