Abstract

Inventory management is one of the most important components of Alibaba’s business. Traditionally, human buyers make replenishment decisions: although artificial intelligence (AI) algorithms make recommendations, human buyers can choose to ignore these recommendations and make their own decisions. The company has been exploring a new replenishment system in which algorithmic recommendations are final. The algorithms combine state-of-the-art deep reinforcement learning techniques with the framework of fictitious play. By learning the supplier’s behavior, we are able to address the important issues of lead time and fill rate on order quantity, which have been ignored in the extant literature on stochastic inventory control. We present evidence that our algorithms outperform human buyers in terms of reducing out-of-stock rates and inventory levels. More interestingly, we have seen additional benefits amid the pandemic. Over the last two years, cities in China have locked down partially and intermittently to mitigate COVID-19 outbreaks. We have observed panic buying from human buyers during lockdowns, leading to the bullwhip effect. By contrast, our algorithms can mitigate panic buying and the bullwhip effect because they are able to recognize changes in the supplier’s behavior during lockdowns.

History: This paper has been accepted for the INFORMS Journal on Applied Analytics Special Issue—2022 Daniel H. Wagner Prize for Excellence in the Practice of Advanced Analytics and Operations Research.
