Abstract

This paper considers the use of reinforcement learning for a multi-period, multi-item inventory control problem with a budget constraint. In this problem, we decide the order quantities of multiple items subject to the budget constraint so as to minimize the total inventory cost, which consists of the inventory holding cost and the backlog cost. The previous literature proposed a modified Q-learning that embeds an optimization model in the Q-learning procedure to handle budget-constrained actions, but that approach does not scale well. To address this issue, this paper proposes a two-stage method: in the first stage, Q-learning learns actions without considering the budget constraint, and in the second stage, an optimization model adjusts the learned actions so that they satisfy the budget constraint. A numerical study compares the performance of the proposed two-stage method with that of other methods, namely a conventional Q-learning without the budget constraint and the modified Q-learning from the literature. The numerical experiments reveal that the proposed method significantly reduces the computation time without increasing the total inventory cost.
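As an illustration of the second stage only, the following is a minimal sketch and not the optimization model used in the paper. It assumes integer order quantities, a linear purchase cost per unit, and an objective of staying as close as possible (in total units) to the learned quantities; the function name `adjust_to_budget` and its parameters are hypothetical. It enumerates candidates by brute force, so it is suitable for tiny instances only.

```python
from itertools import product


def adjust_to_budget(learned, unit_costs, budget):
    """Reduce learned order quantities so that total spend fits the budget.

    Assumption (not from the paper): among all integer plans q with
    0 <= q[i] <= learned[i] and sum(unit_costs[i] * q[i]) <= budget,
    pick one that removes the fewest units in total.
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")

    best, best_removed = None, None
    # Enumerate every plan that orders no more than the learned quantity.
    for cand in product(*(range(q + 1) for q in learned)):
        spend = sum(c * q for c, q in zip(unit_costs, cand))
        if spend > budget:
            continue  # violates the budget constraint
        removed = sum(a - q for a, q in zip(learned, cand))
        if best_removed is None or removed < best_removed:
            best, best_removed = cand, removed
    # The all-zero plan is always feasible, so best is never None here.
    return list(best)
```

For example, with learned quantities `[3, 2]`, unit costs `[2, 5]`, and a budget of 10, the unconstrained plan would cost 16; `adjust_to_budget([3, 2], [2, 5], 10)` returns a feasible plan that orders three units in total. Because the adjustment is applied after learning, the Q-learning stage itself does not need to solve an optimization problem at every step, which is the scalability argument made above.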
