Abstract

This paper studies discounted and average-reward Markov decision processes with a finite state space and a countable action set (semi-infinite MDPs for short). For the discounted case, without the usual continuity and compactness conditions, we use results on semi-infinite linear programming due to Tijs [20] to show that a semi-infinite discounted MDP can be approximated by a sequence of finite discounted MDPs, and that even in the semi-infinite setting it suffices to restrict attention to the class of deterministic stationary strategies. For the average-reward case, we prove that under some conditions the supremum over the class of general strategies equals the supremum over the class of deterministic stationary strategies. A counterexample shows that these conditions cannot be easily relaxed.
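The finite-approximation idea for the discounted case can be illustrated with a toy sketch. It is not taken from the paper: the paper argues via semi-infinite linear programming, whereas this sketch substitutes plain value iteration on truncated action sets. The one-state MDP, the reward 1 - 1/a, the discount factor 0.5, and the names `solve_truncated_mdp`, `reward`, and `transition` are all assumptions chosen for illustration. In this toy problem the supremum of the discounted reward is 2 and is not attained by any single action, yet the optimal values of the truncated finite MDPs increase toward it.

```python
def solve_truncated_mdp(n_states, n_actions, reward, transition, beta,
                        tol=1e-10, max_iter=100_000):
    """Value iteration on the finite MDP that keeps only actions 1..n_actions.

    reward(s, a) returns the one-step reward; transition(s, a) returns a list
    of transition probabilities over the states. Both are hypothetical
    callables supplied by the caller.
    """
    v = [0.0] * n_states
    for _ in range(max_iter):
        new_v = []
        for s in range(n_states):
            # Maximising over the finitely many kept actions picks one action
            # per state, i.e. a deterministic stationary strategy.
            best = max(
                reward(s, a)
                + beta * sum(p * v[t] for t, p in enumerate(transition(s, a)))
                for a in range(1, n_actions + 1)
            )
            new_v.append(best)
        diff = max(abs(x - y) for x, y in zip(new_v, v))
        v = new_v
        if diff < tol:
            break
    return v


BETA = 0.5  # assumed discount factor


def reward(state, action):
    # Assumed toy reward: approaches 1 as the action index grows, never equals 1.
    return 1.0 - 1.0 / action


def transition(state, action):
    # Single state: every action returns to it with probability 1.
    return [1.0]


# Optimal values of the truncated MDPs for growing finite action sets.
values = {
    n: solve_truncated_mdp(1, n, reward, transition, BETA)[0]
    for n in (1, 10, 100, 1000)
}
```

With n actions kept, the optimal value in this toy problem is (1 - 1/n) / (1 - 0.5), so the entries of `values` grow with n and stay strictly below the supremum 2.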
