Abstract
Given a family of Markov chains with a single recurrent class, we present a potential application of Schweitzer's exact formula relating the steady-state probability and fundamental matrices of any two chains in the family. We propose a new policy iteration scheme for Markov decision processes in which, in contrast to standard policy iteration, the criterion for selecting an action ensures the maximal one-step improvement in average cost. The computational complexity and storage requirements of the scheme are analysed.