Abstract

In this chapter, we discuss various models of joint pricing and inventory control with online demand learning. When firms make pricing and inventory replenishment decisions, it is critical to know the underlying demand distribution. However, this information is typically not known in practice and needs to be learned from historical data. Unlike the traditional literature, which assumes that the demand distribution is known a priori and takes it as model input, this chapter reviews online learning algorithms that learn the demand distribution on the fly while optimizing the total reward. Theoretical performance guarantees show that the regret, defined as the total reward loss of a learning algorithm over the planning horizon T compared with the optimal solution that would be attained had the demand distribution been known, grows at the lowest possible rate as a function of T among all learning algorithms.

