Adaptive Algorithm for Multi-Armed Bandit Problem with High-Dimensional Covariates

Wei Qian, Ching-Kang Ing, Ji Liu

doi:10.1080/01621459.2022.2152343

Abstract

This article studies an important sequential decision-making problem known as the multi-armed stochastic bandit problem with covariates. Under a linear bandit framework with high-dimensional covariates, we propose a general multi-stage arm allocation algorithm that integrates both arm elimination and randomized assignment strategies. By employing a class of high-dimensional regression methods for coefficient estimation, the proposed algorithm is shown to have near-optimal finite-time regret performance under a new study scope that requires neither a margin condition nor a reward gap condition for competitive arms. Based on the synergistically verified benefit of the margin, our algorithm adapts automatically to the margin and gap conditions and attains optimal regret rates, up to a logarithmic factor, simultaneously for both study scopes, with or without the margin. Besides the desirable regret performance, the proposed algorithm simultaneously generates useful coefficient estimation output for competitive arms and is shown to achieve both estimation consistency and variable selection consistency. Promising empirical performance is demonstrated through extensive simulation and two real data evaluation examples. Supplementary materials for this article are available online.
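The combination of arm elimination, randomized assignment, and sparse coefficient estimation described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' algorithm: a plain coordinate-descent Lasso stands in for the "class of high-dimensional regression methods", and every name and parameter (`lasso_cd`, `choose_arm`, `run`, `gap`, `eps`, `stage_len`, the synthetic sparse coefficients) is a hypothetical choice made for illustration only.

```python
import random

def soft_threshold(z, lam):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=30):
    """Coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b; equals y because b starts at zero
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes beta[j].
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def choose_arm(x, betas, gap, eps, rng):
    """Drop arms whose estimated reward trails the best by more than `gap`,
    then pick the best arm, or a random survivor with probability `eps`."""
    est = [dot(x, b) for b in betas]
    best = max(range(len(est)), key=est.__getitem__)
    survivors = [i for i, e in enumerate(est) if est[best] - e <= gap]
    if rng.random() < eps:
        return rng.choice(survivors)
    return best

def run(T=300, p=10, n_arms=3, stage_len=60, lam=0.05, gap=0.5, eps=0.1, seed=0):
    """Simulate the toy bandit; assumes stage_len >= n_arms so every arm has data."""
    rng = random.Random(seed)
    # Hypothetical sparse truth: arm i has one nonzero coefficient, at index i.
    truth = [[1.0 if j == i else 0.0 for j in range(p)] for i in range(n_arms)]
    data = [([], []) for _ in range(n_arms)]
    betas = [[0.0] * p for _ in range(n_arms)]
    regret = 0.0
    for t in range(T):
        x = [rng.gauss(0, 1) for _ in range(p)]
        if t < stage_len:
            arm = t % n_arms  # first stage: round-robin sampling of all arms
        else:
            arm = choose_arm(x, betas, gap, eps, rng)
        means = [dot(x, b) for b in truth]
        reward = means[arm] + rng.gauss(0, 0.1)
        data[arm][0].append(x)
        data[arm][1].append(reward)
        regret += max(means) - means[arm]
        if (t + 1) % stage_len == 0:  # refit every arm at the end of a stage
            betas = [lasso_cd(X, y, lam) for X, y in data]
    return regret, betas

if __name__ == "__main__":
    total_regret, estimates = run()
    print("cumulative regret:", round(total_regret, 2))
```

The returned `betas` play the role of the per-arm coefficient estimates mentioned in the abstract, and `regret` accumulates the gap between the best arm's expected reward and that of the chosen arm at each round. The dimensions here are deliberately small so the pure-Python Lasso runs quickly; they do not reflect a genuinely high-dimensional setting.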

Journal: Journal of the American Statistical Association
Publication Date: Nov 25, 2022
License type: CC BY

Similar Papers

Online Decision-Making with High-Dimensional Covariates
Hamsa Bastani ... Mohsen Bayati
SSRN Electronic Journal | VOL. 68
18 Sep 2015

An Optimal Algorithm for the Stochastic Bandits While Knowing the Near-Optimal Mean Reward
Shangdong Yang ... Yang Gao
IEEE Transactions on Neural Networks and Learning Systems | VOL. 32
01 May 2021

A Multi-armed Bandit Algorithm Available in Stationary or Non-stationary Environments Using Self-organizing Maps
Nobuhito Manome ... Shuji Shinohara
01 Jan 2019

Adaptive Exploration in Stochastic Multi-armed Bandit Problem
Xiaofang Zhang ... Quan Liu
27 Dec 2016
