Abstract

In Markov Decision Chains (MDCs) with a finite state space and finite action sets, it is well known that the limits of α-discounted optimal policies, as α tends to 1, are Blackwell optimal. A recent paper by Cavazos-Cadena and Lasserre conjectured that this property of limiting policies also holds for unichain MDCs. We disprove this conjecture by constructing a limiting policy that is not Blackwell optimal in a unichain MDC with finitely many states and compact action sets.

