Abstract

This work deals with Markov decision processes (MDPs) with expected total rewards, discrete state spaces, and compact action sets. Within this framework, a question posed by Puterman (1994, p. 326) about the existence of optimal stationary policies is considered. The paper examines whether an affirmative answer can be obtained when additional assumptions are imposed on the decision model. Three conditions ensuring the existence of average-optimal stationary policies in finite-state MDPs are analyzed, and it is shown that only the so-called structural continuity condition is a natural sufficient assumption under which the existence of total-reward optimal stationary policies can be guaranteed. In particular, this existence result holds for unichain MDPs with a finite state space, but an example is provided to show that the conclusion does not extend to the denumerable state space case.
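For orientation, a minimal sketch of the optimality criterion at issue, stated in standard notation (state space $X$, action sets $A(x)$, one-step reward $r$, policy class $\Pi$) that the abstract itself does not fix:

\[
V(\pi, x) \;=\; \mathbb{E}^{\pi}_{x}\!\left[\sum_{t=0}^{\infty} r(x_t, a_t)\right],
\qquad
V^{*}(x) \;=\; \sup_{\pi \in \Pi} V(\pi, x).
\]

A stationary policy $f$ is total-reward optimal when $V(f, x) = V^{*}(x)$ for all $x \in X$; Puterman's question asks under what conditions on the model such a policy exists.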
