UCB-Type Learning Algorithms for Lost-Sales Inventory Models with Lead Times

Chengyi Lyu, Linwei Xin, Huanan Zhang

doi:10.2139/ssrn.3944354

Abstract

In this paper, we consider a classic periodic-review lost-sales inventory system with lead times, which has a wide range of real-world applications and is notoriously challenging to optimize. We consider a joint learning and optimization problem in which the decision-maker does not know the demand distribution a priori and can only use past sales information (i.e., censored demand). Existing learning algorithms for this problem (e.g., Huh et al. 2009a, Agrawal and Jia 2019, Zhang et al. 2020) require the convexity property of the underlying system. Departing from them, we develop an Upper Confidence Bound (UCB)-type learning framework and show that it can be applied to learning not only the optimal base-stock policy but also the optimal capped base-stock policy, for which the convexity property no longer holds. Compared with a classic multi-armed bandit problem, our problem poses unique challenges that stem from the nature of the inventory system: (1) each action has long-term impacts on future costs, and (2) the system state space is exponentially large in the lead time. Hence, our learning algorithms are not naive adaptations of the classic UCB algorithm: the design of the simulation and averaging steps is novel, and the confidence width in the UCB index also differs from the classic one. We prove that the regrets of our learning algorithms are tight, up to a logarithmic term, in the planning horizon T. Our extensive numerical experiments suggest that the proposed algorithms (almost) dominate existing learning algorithms. We also propose a practical way to select which learning algorithm to use when demand data are limited.
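To make the setting concrete, the sketch below simulates a periodic-review lost-sales system with a lead time under a base-stock rule and a capped base-stock rule, and records only the sales (the censored demand) that a decision-maker would observe. It is an illustration of the system the abstract describes, not the authors' learning algorithm. The function names, the cost parameters, the order of events within a period, and the form of the capped rule (order up to S, but never more than r per period) are all assumptions made for this example.

```python
import random


def base_stock_rule(S):
    """Order enough to raise the inventory position to S (assumed form)."""
    return lambda position: max(S - position, 0)


def capped_base_stock_rule(S, r):
    """Order up to S, but never more than r units per period (assumed form)."""
    return lambda position: min(max(S - position, 0), r)


def simulate_policy(order_rule, demands, lead_time,
                    holding_cost=1.0, penalty_cost=4.0):
    """Simulate a lost-sales system; return (average cost, observed sales).

    Assumed order of events in each period: place an order, receive the
    order placed `lead_time` periods ago, then serve demand. Unmet demand
    is lost and is never observed directly, only through the sales.
    """
    on_hand = 0
    pipeline = [0] * lead_time  # outstanding orders, oldest first
    total_cost = 0.0
    sales = []
    for demand in demands:
        position = on_hand + sum(pipeline)  # inventory position
        pipeline.append(order_rule(position))
        on_hand += pipeline.pop(0)          # oldest order arrives
        sold = min(on_hand, demand)         # censored observation
        sales.append(sold)
        on_hand -= sold
        total_cost += holding_cost * on_hand + penalty_cost * (demand - sold)
    return total_cost / len(demands), sales


# Hypothetical demand stream: integers drawn uniformly from 0..5.
rng = random.Random(0)
demand_path = [rng.randint(0, 5) for _ in range(1000)]
for label, rule in [("base-stock S=8", base_stock_rule(8)),
                    ("capped S=8, r=3", capped_base_stock_rule(8, 3))]:
    avg_cost, _ = simulate_policy(rule, demand_path, lead_time=2)
    print(label, round(avg_cost, 3))
```

Because the pipeline holds one entry per period of lead time, the state grows with the lead time, and every order affects costs several periods later. These are the two features the abstract names as the obstacles to applying a classic bandit algorithm directly.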


Similar Papers

Joint Learning and Optimization for Multi-Product Pricing (and Ranking) Under a General Cascade Click Model
Xiangyu Gao ... Huanan Zhang
Management Science | VOL. 68
28 Jan 2022

Closing the Gap: A Learning Algorithm for Lost-Sales Inventory Systems with Lead Times
Huanan Zhang ... Xiuli Chao
Management Science | VOL. 66
26 Feb 2017

UCB-Type Learning Algorithms with Kaplan–Meier Estimator for Lost-Sales Inventory Models with Lead Times
Chengyi Lyu ... Huanan Zhang
Operations Research | VOL. 72
29 Feb 2024

Bandit Methods and Selective Prediction in Deep Learning
29 Jun 2020
