Abstract

We consider a periodic-review, single-product, multi-echelon inventory problem with instantaneous replenishment. In each period, the decision-maker makes ordering decisions for all echelons. Any unsatisfied demand is back-ordered, and any excess inventory is carried over to the next period. In contrast to the classic inventory literature, we assume that the demand distribution is not known a priori; instead, the decision-maker observes demand realizations over the planning horizon. We propose a nonparametric algorithm, based on stochastic gradient descent, that generates a sequence of adaptive ordering decisions. We compare the [Formula: see text]-period cost of our algorithm to that of the clairvoyant, who knows the underlying demand distribution in advance, and we prove that the expected [Formula: see text]-period regret is at most [Formula: see text], matching a lower bound for this problem.
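To make the idea of adaptive ordering via stochastic gradient descent concrete, the following is a minimal sketch for a simplified single-echelon version of the problem. It is not the algorithm of the paper: the function name `sgd_base_stock`, the cost parameters `h` (holding) and `b` (back-order), the cap `s_max`, and the step-size rule proportional to 1/sqrt(t) are all assumptions chosen for illustration. The learner sees only realized demands and adjusts a target base-stock level using a subgradient of the one-period cost.

```python
import random


def sgd_base_stock(demands, h=1.0, b=3.0, s0=0.0, s_max=100.0):
    """Projected SGD on a single-echelon base-stock level (illustrative only).

    h, b, s0, s_max and the step-size rule are hypothetical choices,
    not values taken from the paper.
    """
    s = s0        # current target base-stock level
    x = 0.0       # net inventory carried into the period (negative = backlog)
    levels, costs = [], []
    for t, d in enumerate(demands, start=1):
        # Instantaneous replenishment: raise inventory to s; stock cannot be discarded.
        y = max(s, x)
        levels.append(s)
        # One-period cost: holding on leftovers, penalty on back-orders.
        costs.append(h * max(y - d, 0.0) + b * max(d - y, 0.0))
        # Subgradient of the one-period cost with respect to the inventory level.
        grad = h if y > d else -b
        step = (s_max / max(h, b)) / t ** 0.5
        # Gradient step, projected back onto [0, s_max].
        s = min(max(s - step * grad, 0.0), s_max)
        x = y - d  # unmet demand is back-ordered, excess is carried over
    return levels, costs


if __name__ == "__main__":
    random.seed(0)
    demands = [random.uniform(0, 100) for _ in range(5000)]
    levels, costs = sgd_base_stock(demands)
    print("final base-stock level:", round(levels[-1], 2))
    print("average cost per period:", round(sum(costs) / len(costs), 2))
```

In this single-echelon setting with a known distribution, the clairvoyant benchmark would be the b/(b+h) quantile of demand (75 for the uniform demand used above), so the gap between the learner's cumulative cost and the cost at that level plays the role of the regret discussed in the abstract.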
