Abstract
We present an Linear Programming formulation of MDPs with countable state and action spaces and no unichain assumption. This is an extension of the Hordijk and Kallenberg (1979) formulation in finite state and action spaces. We provide sufficient conditions for both existence of optimal solutions to the primal LP program and absence of duality gap. Then, existence of a (possibly randomized) average optimal policy is also guaranteed. Existence of a stationary average optimal deterministic policy is also investigated.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have