On the optimality equation for average cost Markov decision processes and its validity for inventory control

Eugene A. Feinberg, Yan Liang

doi:10.1007/s10479-017-2561-9

Abstract

As is well known, average-cost optimality inequalities imply the existence of stationary optimal policies for Markov decision processes with average costs per unit time, and these inequalities hold under broad natural conditions. This paper provides sufficient conditions for the validity of the average-cost optimality equation for an infinite state problem with weakly continuous transition probabilities and with possibly unbounded one-step costs and noncompact action sets. These conditions also imply the convergence of sequences of discounted relative value functions to average-cost relative value functions and the continuity of average-cost relative value functions. As shown in this paper, the classic periodic-review setup-cost inventory control problem with backorders and convex holding/backlog costs satisfies these conditions. Therefore, the optimality inequality holds in the form of an equality with a continuous average-cost relative value function for this problem. In addition, the K-convexity of discounted relative value functions and their convergence to average-cost relative value functions, when the discount factor increases to 1, imply the K-convexity of average-cost relative value functions. This implies that average-cost optimal (s, S) policies for the inventory control problem can be derived from the average-cost optimality equation.
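For orientation, the optimality inequality and optimality equation referred to in the abstract are commonly written in the form sketched below. The notation is an assumption made here for illustration and is not taken from the abstract: X is the state space, A(x) the set of feasible actions in state x, c(x, a) the one-step cost, q(dy | x, a) the transition probability, w a constant playing the role of the optimal average cost, and u the average-cost relative value function.

```latex
% Average-cost optimality inequality (assumed standard form):
\[
  w + u(x) \;\ge\; \min_{a \in A(x)} \Big[ c(x,a) + \int_{X} u(y)\, q(dy \mid x, a) \Big],
  \qquad x \in X .
\]
% Average-cost optimality equation: the same relation holding with equality.
\[
  w + u(x) \;=\; \min_{a \in A(x)} \Big[ c(x,a) + \int_{X} u(y)\, q(dy \mid x, a) \Big],
  \qquad x \in X .
\]
```

In this notation, a stationary policy that selects a minimizing action in each state is the natural candidate for an average-cost optimal policy. The equality form is the one the abstract uses to derive optimal (s, S) policies for the inventory problem.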

Full Text