Abstract

This paper considers the use of reinforcement learning for a multi-period, multi-item inventory control problem with a budget constraint. In this problem, the order quantities of multiple items are decided subject to the budget constraint so as to minimize the total inventory cost, which includes inventory holding cost and backlog cost. The previous literature proposed a modified Q-learning method that embeds an optimization model in the Q-learning procedure to handle budget-constrained actions, but that method lacks scalability. To address this issue, this paper proposes a two-stage method: in the first stage, Q-learning learns actions without considering the budget constraint; in the second stage, an optimization model adjusts the learned actions so that they satisfy the budget constraint. A numerical study compares the performance of the proposed two-stage method with that of other methods, such as conventional Q-learning without the budget constraint and the modified Q-learning from the literature. The numerical experiments reveal that the proposed method significantly reduces the computation time without increasing the total inventory cost.
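
As an illustration only, the second stage of such a two-stage method could be sketched as follows. The abstract does not specify the optimization model, so everything here is an assumption: the function name `adjust_to_budget`, the per-unit purchase costs, the objective (minimize the total number of units cut from the learned order quantities), and the use of exhaustive enumeration in place of whatever solver the paper employs.

```python
from itertools import product


def adjust_to_budget(orders, unit_costs, budget):
    """Return a budget-feasible order vector close to `orders`.

    Hypothetical stage-2 model (not taken from the paper): choose integers
    0 <= x[i] <= orders[i] with sum(unit_costs[i] * x[i]) <= budget,
    minimizing the total reduction sum(orders[i] - x[i]).
    Solved by exhaustive enumeration, so it only suits small instances.
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")

    best, best_cut = None, None
    # Enumerate every order vector that does not exceed the learned quantities.
    for x in product(*(range(q + 1) for q in orders)):
        spend = sum(c * xi for c, xi in zip(unit_costs, x))
        if spend > budget:
            continue  # violates the budget constraint
        cut = sum(q - xi for q, xi in zip(orders, x))
        if best is None or cut < best_cut:
            best, best_cut = x, cut
    return list(best)


# Stage 1 (assumed to have run already) produced these unconstrained orders.
learned_orders = [3, 2]   # units per item, hypothetical values
unit_costs = [2, 5]       # purchase cost per unit, hypothetical values
adjusted = adjust_to_budget(learned_orders, unit_costs, budget=10)
```

Because the adjustment runs only on the action that Q-learning has already selected, the optimization model is solved once per decision rather than inside every Q-learning update, which is consistent with the reduction in computation time reported above.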
