Abstract

We consider a Markov decision process with a Borel state space, a countable action space, finite action sets, bounded rewards and a bounded transition density satisfying a simultaneous Doeblin condition. We prove the existence of stationary strong 0-discount optimal policies.
