Contextual Bandit with Adaptive Feature Extraction

Baihan Lin, Guillermo A Cecchi, Irina Rish, Djallel Bouneffouf

doi:10.1109/icdmw.2018.00136

Open Access

https://doi.org/10.1109/icdmw.2018.00136

Abstract

We consider an online decision-making setting known as the contextual bandit problem, and propose an approach for improving contextual bandit performance by using adaptive feature extraction (representation learning) based on online clustering. Our approach starts with offline pre-training on an unlabeled history of contexts (which can be exploited by our approach, but not by the standard contextual bandit), followed by online selection and adaptation of encoders. Specifically, given an input sample (context), the proposed approach selects the most appropriate encoding function to extract a feature vector, which becomes the input to a contextual bandit, and updates both the bandit and the encoding function based on the context and on the feedback (reward). Our experiments on a variety of datasets, in both stationary and non-stationary environments of several kinds, demonstrate clear advantages of the proposed adaptive representation learning over the standard contextual bandit based on raw input contexts.
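
The loop described in the abstract (select an encoder, extract features, act, then update both bandit and encoder) can be illustrated with a minimal sketch. This is not the authors' implementation. Everything named here is an assumption made for illustration: the class `AdaptiveEncoderBandit`, the nearest-centroid rule for encoder selection, the encoder itself (a one-hot cluster indicator plus the centroid-centered context), an epsilon-greedy linear bandit swapped in for brevity, and the toy two-cluster environment in `simulate`. The encoder update in this sketch uses only the context (an online k-means step), whereas the abstract states that the feedback is used as well.

```python
import random


class AdaptiveEncoderBandit:
    """Hypothetical sketch: nearest-centroid encoder selection + epsilon-greedy linear bandit."""

    def __init__(self, centroids, n_arms, epsilon=0.1, lr=0.05, seed=0):
        self.rng = random.Random(seed)
        # Initial centroids stand in for offline pre-training on unlabeled contexts.
        self.centroids = [list(c) for c in centroids]
        self.counts = [1] * len(self.centroids)
        n_features = len(self.centroids) + len(self.centroids[0])
        # One linear reward model per arm over the encoded features.
        self.weights = [[0.0] * n_features for _ in range(n_arms)]
        self.epsilon = epsilon
        self.lr = lr

    def select_encoder(self, context):
        # Choose the cluster (and hence the encoder) whose centroid is nearest.
        def sq_dist(c):
            return sum((x - ci) ** 2 for x, ci in zip(context, c))
        return min(range(len(self.centroids)), key=lambda k: sq_dist(self.centroids[k]))

    def encode(self, context, k):
        # Assumed encoder: one-hot cluster id followed by the centered context.
        one_hot = [1.0 if j == k else 0.0 for j in range(len(self.centroids))]
        centered = [x - c for x, c in zip(context, self.centroids[k])]
        return one_hot + centered

    def choose_arm(self, features):
        # Epsilon-greedy over the per-arm linear reward estimates.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.weights))
        scores = [sum(w * f for w, f in zip(ws, features)) for ws in self.weights]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, context, k, features, arm, reward):
        # Bandit update: one SGD step on squared error for the chosen arm.
        pred = sum(w * f for w, f in zip(self.weights[arm], features))
        err = reward - pred
        self.weights[arm] = [w + self.lr * err * f
                             for w, f in zip(self.weights[arm], features)]
        # Encoder update: online k-means step moving centroid k toward the context.
        self.counts[k] += 1
        step = 1.0 / self.counts[k]
        self.centroids[k] = [c + step * (x - c)
                             for c, x in zip(self.centroids[k], context)]


def simulate(rounds=2000, seed=0):
    """Toy environment (an assumption): two context clusters, each with its own best arm."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (5.0, 5.0)]
    agent = AdaptiveEncoderBandit([(0.5, -0.5), (4.0, 6.0)], n_arms=2, seed=seed)
    rewards = []
    for _ in range(rounds):
        true_cluster = rng.randrange(2)
        context = [rng.gauss(m, 0.5) for m in centers[true_cluster]]
        k = agent.select_encoder(context)
        features = agent.encode(context, k)
        arm = agent.choose_arm(features)
        reward = 1.0 if arm == true_cluster else 0.0
        agent.update(context, k, features, arm, reward)
        rewards.append(reward)
    return agent, rewards
```

Calling `simulate()` returns the trained agent and the per-round rewards, so the average reward over the final rounds can be compared with the early rounds to see the effect of learning. The cluster indicator in the encoded features is what lets a single linear model per arm represent a different best arm in each cluster.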


Publication Date: Nov 1, 2018
Citations: 40
License type: other-oa

Similar Papers

Pairwise Regression with Upper Confidence Bound for Contextual Bandit with Multiple Actions
Ya-Hsuan Chang ... Hsuan-Tien Lin
01 Dec 2013

Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits Under Realizability
David Simchi-Levi ... Yunzong Xu
Mathematics of Operations Research | VOL. 47
09 Dec 2021

Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits under Realizability
David Simchi-Levi ... Yunzong Xu
SSRN Electronic Journal
12 Jul 2020

Thompson Sampling with Time-Varying Reward for Contextual Bandits
Cairong Yan ... Yanting Zhang
01 Jan 2023
