Abstract

This paper deals with discrete-time Markov decision processes with average sample-path costs (ASPC) in Borel spaces. The cost function is allowed to be unbounded both from above and from below. We propose new conditions for the existence of ε-ASPC-optimal (deterministic) stationary policies in the class of all randomized history-dependent policies; these conditions are weaker than those in the previous literature. Moreover, we give sufficient conditions for the existence of ASPC-optimal stationary policies that are imposed directly on the primitive data of the model. In particular, this paper is the first to use a stochastic monotonicity condition to study the ASPC criterion. Our approach also differs slightly from the "optimality equation approach" widely used in the previous literature. In addition, under mild assumptions we show that average expected cost optimality and ASPC-optimality are equivalent. Finally, we use a controlled queueing system to illustrate our results.
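For orientation, the following is a minimal sketch of the sample-path and expected average cost criteria as they commonly appear in the MDP literature; the notation (states x_t, actions a_t, one-stage cost c, and the optimality formulation itself) is generic and assumed here, not taken from the paper's body.

```latex
% One common formulation of the two criteria (generic notation; the paper's
% exact definitions may differ in detail). Under a policy $\pi$ and initial
% state $x$, the average sample-path cost is the pathwise long-run average
% of the one-stage costs $c(x_t, a_t)$:
\[
  J(\pi, x) \;=\; \limsup_{n \to \infty} \frac{1}{n}
  \sum_{t=0}^{n-1} c(x_t, a_t)
  \qquad P_x^{\pi}\text{-almost surely},
\]
% whereas the average expected cost takes the expectation before the limit:
\[
  V(\pi, x) \;=\; \limsup_{n \to \infty} \frac{1}{n}\,
  E_x^{\pi}\!\left[\, \sum_{t=0}^{n-1} c(x_t, a_t) \right].
\]
% A policy $\pi^*$ is then called ASPC-optimal when its sample-path average
% cost is almost surely no larger than that of any other policy, and
% $\varepsilon$-ASPC-optimal when this bound is relaxed by $\varepsilon > 0$.
```

The key distinction the abstract's equivalence result addresses is that $J(\pi, x)$ is a random variable (a property of individual cost trajectories), while $V(\pi, x)$ is deterministic; the two criteria need not coincide without further assumptions.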
