Abstract
In Markov Decision Chains (MDCs) with a finite state space and finite action sets, it is well known that the limits of α-discounted optimal policies, as α tends to 1, are Blackwell optimal. A recent paper by Cavazos-Cadena and Lasserre conjectured that this property of limiting policies also holds for unichain MDCs. We disprove this conjecture by constructing a limiting policy that is not Blackwell optimal in a unichain MDC with finitely many states and compact action sets.