Abstract
We consider a Markov decision process with a Borel state space, a countable action space, finite action sets, bounded rewards, and a bounded transition density satisfying a simultaneous Doeblin condition. We prove the existence of stationary strong 0-discount optimal policies.