Abstract
We develop the first nonparametric learning algorithm for periodic-review perishable inventory systems. In contrast to the classical perishable inventory literature, we assume that the firm does not know the demand distribution a priori and makes replenishment decisions in each period based only on past sales (censored demand) data. It is well known that even with complete information about the demand distribution, the optimal policy for this problem does not possess a simple structure. Motivated by studies in the literature showing that base-stock policies perform near-optimally in these systems, we focus on finding the best base-stock policy. We first establish a convexity result, showing that the total holding, lost-sales, and outdating cost is convex in the base-stock level. Then, we develop a nonparametric learning algorithm that generates a sequence of order-up-to levels whose running average cost converges to the cost of the optimal base-stock policy. We establish a square-root convergence rate for the proposed algorithm, which is the best possible. Our algorithm and analyses require a novel method for computing a valid cycle subgradient and the construction of a bridging problem, both of which depart significantly from previous studies. The e-companion is available at https://doi.org/10.1287/opre.2018.1724
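To make the cost being minimized concrete, the following is a minimal sketch of how the average holding, lost-sales, and outdating cost of a single order-up-to level might be evaluated on a fixed demand path. It is not the learning algorithm of the paper, which observes only censored sales and relies on cycle subgradients; here demand is fully observed and the level is chosen by plain grid search. The function name `average_cost`, the cost parameters `h`, `p`, `theta`, the zero lead time, the FIFO issuing rule, and the fixed `lifetime` are all assumptions made for illustration.

```python
import random


def average_cost(S, demands, lifetime=3, h=1.0, p=5.0, theta=2.0):
    """Average per-period cost of an order-up-to-S policy on a demand path.

    Illustrative assumptions (not taken from the source): zero lead time,
    FIFO issuing, a fixed product lifetime in periods, and linear holding (h),
    lost-sales (p), and outdating (theta) costs.
    """
    # stock[i] holds units with i + 1 periods of life left; stock[0] is oldest.
    stock = [0.0] * lifetime
    total = 0.0
    for d in demands:
        # Order up to S; fresh units enter with the full lifetime.
        stock[-1] += max(S - sum(stock), 0.0)

        # Serve demand from the oldest units first (FIFO).
        remaining = d
        for i in range(lifetime):
            used = min(stock[i], remaining)
            stock[i] -= used
            remaining -= used
        lost = remaining

        # Oldest units still unsold at the end of the period expire.
        outdated = stock[0]
        # Age the rest by one period; the freshest slot starts empty.
        stock = stock[1:] + [0.0]

        total += h * sum(stock) + p * lost + theta * outdated
    return total / len(demands)


# Usage: evaluate a grid of base-stock levels on one simulated demand path.
rng = random.Random(0)
demand_path = [rng.randint(0, 10) for _ in range(2000)]
costs = {S: average_cost(S, demand_path) for S in range(0, 31)}
best_S = min(costs, key=costs.get)
print("best order-up-to level on this path:", best_S)
```

Because the grid search needs the full demand path, it serves only as a full-information benchmark; under censored demand the firm sees sales rather than demand, which is the setting the abstract addresses.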