Abstract

Personalization has become a focal point of modern revenue management. However, minimal data are often available to make suggestions appropriately tailored to each customer. This has led many products to adopt reinforcement learning-based algorithms that explore sets of offerings to find the suggestions that best improve conversion and revenue. Arguably the most popular of these algorithms are built on the foundation of the multi-armed bandit framework, which has shown great success across a variety of use cases. A general multi-armed bandit algorithm aims to adaptively trade off exploring available but under-observed recommendations against exploiting the currently known best offering. While much success has been achieved with these relatively understandable procedures, much of the airline industry is losing out on better personalized offers by ignoring the context of the transaction, as is the case in the traditional multi-armed bandit setup. Here, we explore a popular exploration heuristic, Thompson sampling, and note implementation details for multi-armed and contextual bandit variants. While the contextual bandit requires greater computational and technical complexity to include contextual features in the decision process, we illustrate the value it brings by the improvement in overall expected
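To make the explore/exploit trade-off concrete, the sketch below shows Thompson sampling for a (non-contextual) multi-armed bandit with Bernoulli conversion feedback, using a Beta posterior per arm. The arm count, conversion rates, and round count are illustrative assumptions, not values from the paper.

```python
import random

def thompson_sample_step(successes, failures):
    """Choose an arm by drawing one sample from each arm's Beta(s+1, f+1)
    posterior and picking the arm with the largest draw."""
    draws = [random.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=lambda i: draws[i])

# Hypothetical offers with assumed true conversion rates (for illustration only).
true_rates = [0.05, 0.10, 0.15]
successes = [0, 0, 0]
failures = [0, 0, 0]

random.seed(0)
for _ in range(5000):
    arm = thompson_sample_step(successes, failures)
    # Simulate whether the customer converts on the chosen offer.
    if random.random() < true_rates[arm]:
        successes[arm] += 1
    else:
        failures[arm] += 1

pulls = [s + f for s, f in zip(successes, failures)]
```

Early on, wide posteriors make all arms likely to be sampled (exploration); as evidence accumulates, the posterior concentrates and the best-converting offer is chosen most often (exploitation). A contextual variant replaces the per-arm Beta posteriors with a posterior over model parameters (e.g. a Bayesian logistic or linear model of reward given transaction features), sampling parameters each round and picking the arm with the highest predicted reward.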
