Abstract

We consider a Markov decision process with a Borel state space, bounded rewards, and a bounded transition density satisfying a simultaneous Doeblin-Doob condition. An asymptotic expansion of the discounted value function, related to the existence of stationary strong 0-discount optimal policies, is extended from the case of finite action sets to that of compact action sets with rewards and transition densities continuous in the action.
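For orientation, in one standard formulation (the notation below is assumed for illustration, not taken from the paper), the discounted value function and the strong 0-discount optimality criterion referred to above can be written as:

```latex
% beta-discounted value function under policy pi, started at state x:
V_\beta^\pi(x) \;=\; \mathbb{E}_x^\pi \!\left[ \sum_{t=0}^{\infty} \beta^t \, r(x_t, a_t) \right],
\qquad 0 < \beta < 1,
\qquad V_\beta(x) \;=\; \sup_\pi V_\beta^\pi(x).

% A stationary policy \varphi is strong 0-discount optimal if the loss
% relative to the optimal discounted value vanishes as \beta \uparrow 1:
\lim_{\beta \uparrow 1} \; \sup_x \bigl[ V_\beta(x) - V_\beta^\varphi(x) \bigr] \;=\; 0.
```

The asymptotics studied in the paper concern the behavior of $V_\beta$ as $\beta \uparrow 1$ under the stated Doeblin-Doob-type ergodicity condition.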
