Abstract

This paper presents a recurrence condition for Markov decision processes with a countable state space and bounded rewards. The condition is sufficient for the existence of a Blackwell optimal stationary policy that admits a Laurent series expansion with continuous coefficients. The condition is weak enough that the Markov chain corresponding to a stationary policy may have countably many periodic recurrent classes. Our method yields the deviation matrix in explicit form.
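
For readers unfamiliar with the deviation matrix, the following is a minimal finite-state sketch, not the paper's countable-state construction. It assumes the standard finite-chain identity D = (I - P + P*)^{-1} - P*, where P* is the Cesaro limit of the powers of the transition matrix P, and it approximates P* by a finite average. The function names `cesaro_limit` and `deviation_matrix` and the two-state periodic chain are illustrative choices, not taken from the paper.

```python
import numpy as np


def cesaro_limit(P, n_terms=2000):
    """Approximate P* = lim (1/N) * sum_{k<N} P^k by a finite average.

    This is a numerical approximation (assumption: n_terms is large
    enough); the Cesaro limit exists even when the chain is periodic.
    """
    n = P.shape[0]
    total = np.zeros((n, n))
    power = np.eye(n)
    for _ in range(n_terms):
        total += power
        power = power @ P
    return total / n_terms


def deviation_matrix(P, n_terms=2000):
    """Finite-state deviation matrix via D = (I - P + P*)^{-1} - P*."""
    n = P.shape[0]
    P_star = cesaro_limit(P, n_terms)
    return np.linalg.inv(np.eye(n) - P + P_star) - P_star


# Illustrative periodic chain: two states that alternate deterministically.
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])
D = deviation_matrix(P)
print(D)
```

For this chain the ordinary limit of P^k does not exist, but the Cesaro limit does, so the deviation matrix is still well defined. This is the finite analogue of the periodic recurrent classes that the abstract allows.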
