Online Network Revenue Management Using Thompson Sampling

Kris Johnson, David Simchi-Levi, He Wang

doi:10.2139/ssrn.2588730


Open Access

https://doi.org/10.2139/ssrn.2588730


Abstract

We consider a network revenue management problem in which an online retailer aims to maximize revenue from multiple products subject to limited inventory. As is common in practice, the retailer does not know the consumer's purchase probability at each price and must learn the mean demand from sales data. We propose an efficient and effective dynamic pricing algorithm that builds on the Thompson sampling algorithm used for multi-armed bandit problems by incorporating inventory constraints into both the model and the algorithm. Our algorithm has strong theoretical performance guarantees and shows promising numerical performance compared with other algorithms developed for the same setting. More broadly, our paper contributes to the literature on the multi-armed bandit problem with resource constraints, since our algorithm applies directly to this setting when the inventory constraints are interpreted as general resource constraints.
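To make the idea concrete, the following is a minimal single-product sketch of Thompson sampling with an inventory constraint. It is an illustration under stated assumptions, not the authors' algorithm: the abstract does not specify the demand model or how the constraint enters, so Bernoulli demand with Beta(1, 1) priors, the per-period inventory rate, the small linear program over price probabilities, and the names solve_price_lp and run_thompson_pricing are all hypothetical choices made here.

```python
import random
from itertools import combinations


def solve_price_lp(prices, demand, rate):
    """Maximize sum(p*d*x) s.t. sum(d*x) <= rate, sum(x) <= 1, x >= 0.

    With only two constraints, an optimal vertex has at most two nonzero
    entries, so enumerating single prices and pairs of prices is enough.
    """
    n = len(prices)
    best_value, best_x = 0.0, [0.0] * n  # all zeros means "offer nothing"
    for k in range(n):
        if demand[k] <= 0:
            continue
        x_k = min(1.0, rate / demand[k])
        value = prices[k] * demand[k] * x_k
        if value > best_value:
            best_value, best_x = value, [0.0] * n
            best_x[k] = x_k
    for j, k in combinations(range(n), 2):
        if demand[j] == demand[k]:
            continue
        # Both constraints tight: x_j + x_k = 1 and d_j*x_j + d_k*x_k = rate.
        x_j = (rate - demand[k]) / (demand[j] - demand[k])
        if not 0.0 <= x_j <= 1.0:
            continue
        value = (prices[j] * demand[j] * x_j
                 + prices[k] * demand[k] * (1.0 - x_j))
        if value > best_value:
            best_value, best_x = value, [0.0] * n
            best_x[j], best_x[k] = x_j, 1.0 - x_j
    return best_x


def run_thompson_pricing(prices, true_demand, inventory, horizon, seed=0):
    """Simulate one selling season; returns (revenue, leftover inventory)."""
    rng = random.Random(seed)
    n = len(prices)
    successes = [1] * n  # Beta(1, 1) prior on each purchase probability
    failures = [1] * n
    revenue = 0.0
    for t in range(horizon):
        if inventory == 0:
            break
        # 1. Sample a plausible demand rate for every price from the posterior.
        sampled = [rng.betavariate(successes[k], failures[k]) for k in range(n)]
        # 2. Assumed constraint: spread remaining stock over remaining periods.
        rate = inventory / (horizon - t)
        x = solve_price_lp(prices, sampled, rate)
        # 3. Draw a price according to x; leftover probability means no offer.
        u, cumulative, chosen = rng.random(), 0.0, None
        for k in range(n):
            cumulative += x[k]
            if u < cumulative:
                chosen = k
                break
        if chosen is None:
            continue
        # 4. Observe a sale or no sale and update the posterior for that price.
        if rng.random() < true_demand[chosen]:
            successes[chosen] += 1
            inventory -= 1
            revenue += prices[chosen]
        else:
            failures[chosen] += 1
    return revenue, inventory
```

Calling run_thompson_pricing([10.0, 5.0], [0.2, 0.6], inventory=40, horizon=100) simulates one season with two candidate prices. When inventory is scarce relative to the horizon, the linear program shifts probability toward the higher price or toward making no offer, which is how the constraint shapes exploration in this sketch.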
