Abstract

This paper is concerned with the optimality equation for the average costs in a denumerable state semi-Markov decision model. It will be shown that under each of a number of recurrency conditions on the transition probability matrices associated with the stationary policies, the optimality equation has a bounded solution. This solution indeed yields a stationary policy which is optimal for a strong version of the average cost optimality criterion. Besides the existence of a bounded solution to the optimality equation, we will show that both the value-iteration method and the policy-iteration method can be used to determine such a solution. For the latter method we will prove that the average costs and the relative cost functions of the policies generated converge to a solution of the optimality equation.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call