Abstract
We consider average reward Markov decision processes with a discrete time parameter and a denumerable state space. We address the following problem: find necessary and sufficient conditions under which, for an arbitrary bounded reward function, the corresponding average reward optimality equation has a bounded solution. This problem is solved for a class of systems that includes the case in which, under the action of any stationary policy, the state space is an irreducible positive recurrent class.
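For reference, the average reward optimality equation is commonly written in the form below. The notation is assumed here for illustration and is not taken from the paper: S is the state space, A(x) the set of actions available at state x, r the bounded reward function, p the transition law, g a real constant, and h a real-valued function on S.

\[
g + h(x) \;=\; \sup_{a \in A(x)} \Big[\, r(x,a) + \sum_{y \in S} p(y \mid x, a)\, h(y) \Big], \qquad x \in S.
\]

In this notation, a bounded solution is a pair (g, h) satisfying the equation with h bounded on S; when such a pair exists, g is the optimal average reward.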