Abstract

We consider a joint pricing and inventory control problem in which the customer’s response to selling price and the demand distribution are not known a priori. Unsatisfied demand is lost and unobserved, and the only information available for decision making is the observed sales data (also known as censored demand). Conventional approaches, such as stochastic approximation, online convex optimization, and continuum-armed bandit algorithms, cannot be employed, because neither the realized values of the profit function nor its derivatives are known. A major challenge is that the estimated profit function constructed from observed sales data is multimodal in price. We develop a nonparametric spline approximation–based learning algorithm. The algorithm splits the planning horizon into disjoint exploration and exploitation phases. During the exploration phase, a spline approximation of the demand-price function is constructed from sales data, and the corresponding surrogate optimization problem is then solved on a sparse grid to obtain a recommended price and target inventory level. During the exploitation phase, the algorithm implements the recommended price and inventory level. We establish a (nearly) square-root regret rate, which (almost) matches the theoretical lower bound.
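
The explore-then-exploit structure described above can be illustrated with a toy sketch. This is not the paper's algorithm: every name below is hypothetical, the ground-truth demand model is invented for the demonstration, a piecewise-linear interpolant stands in for the spline approximation, and the per-period profit (revenue minus a unit ordering cost, with no inventory carryover) is an assumed simplification. The exploration phase also stocks generously, so censoring rarely binds here, whereas handling binding censoring is a central difficulty in the actual problem.

```python
import random

def true_demand(price, rng):
    # Hypothetical ground truth, hidden from the learner: linear mean plus uniform noise.
    return max(0.0, 20.0 - 2.0 * price + rng.uniform(-3.0, 3.0))

def linear_spline(knots, values):
    """Piecewise-linear (degree-1 spline) interpolant through (knots, values)."""
    def f(x):
        if x <= knots[0]:
            return values[0]
        for i in range(len(knots) - 1):
            if x <= knots[i + 1]:
                w = (x - knots[i]) / (knots[i + 1] - knots[i])
                return (1.0 - w) * values[i] + w * values[i + 1]
        return values[-1]
    return f

def explore_then_exploit(horizon, knots, n_per_knot, y_max, cost, rng):
    # Exploration: charge each knot price n_per_knot times and record sales only.
    means, residuals = [], []
    for p in knots:
        sales = [min(true_demand(p, rng), y_max) for _ in range(n_per_knot)]
        m = sum(sales) / len(sales)
        means.append(m)
        residuals.extend(s - m for s in sales)
    d_hat = linear_spline(knots, means)

    # Surrogate problem: maximize estimated profit over a price/inventory grid.
    price_grid = [knots[0] + 0.25 * i
                  for i in range(int((knots[-1] - knots[0]) / 0.25) + 1)]
    level_grid = [0.5 * j for j in range(int(y_max / 0.5) + 1)]
    best = (float("-inf"), None, None)
    for p in price_grid:
        d = d_hat(p)
        for y in level_grid:
            est_sales = sum(min(max(d + e, 0.0), y) for e in residuals) / len(residuals)
            est_profit = p * est_sales - cost * y
            if est_profit > best[0]:
                best = (est_profit, p, y)
    _, p_star, y_star = best

    # Exploitation: commit to the recommended pair for the remaining periods.
    n_exploit = horizon - len(knots) * n_per_knot
    total = 0.0
    for _ in range(n_exploit):
        sold = min(true_demand(p_star, rng), y_star)
        total += p_star * sold - cost * y_star
    return p_star, y_star, total / n_exploit

price, level, avg_profit = explore_then_exploit(
    horizon=1000, knots=[1.0, 3.0, 5.0, 7.0, 9.0],
    n_per_knot=40, y_max=25.0, cost=1.0, rng=random.Random(0))
print(price, level, round(avg_profit, 2))
```

In this sketch the exploration length is fixed by hand. The square-root regret rate quoted in the abstract depends on how the exploration length and grid resolution scale with the horizon, which the sketch does not attempt to reproduce.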
