Abstract

Given a family of Markov chains with a single recurrent class, we present a potential application of Schweitzer's exact formula relating the steady-state probability and fundamental matrices of any two chains in the family. We propose a new policy iteration scheme for Markov decision processes where in contrast to policy iteration, the new criterion for selecting an action ensures the maximal one-step average cost improvement. Its computational complexity and storage requirement are analysed.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call