We consider a partially observable inventory system in which the inventory level can be observed only when it reaches zero, unmet demand is lost, and replenishment orders must be chosen to minimize the long-run average cost. This problem has an infinite-dimensional state space and is therefore difficult to analyze. We prove the existence of an average-cost optimal policy using the vanishing discount factor approach. As a main methodological contribution, we provide a way to verify the key condition, the uniform boundedness of the relative discounted value function, for the partially observed system. To accomplish this, we construct for the partially observed system a valid policy that, in a certain sense, copies the actions of another policy for the process with a different initial state. To the best of our knowledge, this paper is the first to address inventory models with partially observable inventory levels under the long-run average cost criterion.