Abstract

A wide variety of stochastic models have an important property known as reversibility. The analysis of reversible Markov chains is often significantly simpler than analysis of general Markov chains, particularly since there are often simple closed-form expressions for their invariant probability mass functions. In this paper we study the structure of optimal control policies for Markov decision processes with reversible dynamics. For optimal control of Markov decision processes, significant analytical and computational simplifications can arise as a result of reversibility. Here we present an analysis of a special class of reversible Markov decision processes, namely those that contain a cycle through their communication graph under any policy. These results provide encouraging evidence that similar simplifications hold for more general models, such as reversible processes without cycles and partition reversible processes. These results are demonstrated on a control-based variant of the Metropolis--Hastin...
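As a small illustration of the closed-form invariant distributions mentioned above, the following sketch uses a birth-death chain, a standard example of a reversible Markov chain. It is not taken from the paper: the function names and the transition probabilities `up` and `down` are hypothetical, chosen only to show that the product-form distribution satisfies detailed balance, pi[i] * P[i][j] == pi[j] * P[j][i].

```python
# Minimal sketch (assumed example, not the paper's model): a birth-death
# chain on states 0..n-1, which is reversible for any valid up/down rates.

def birth_death_chain(up, down):
    """Build the transition matrix.

    up[i]   = P(i -> i+1) for i = 0..n-2
    down[i] = P(i+1 -> i) for i = 0..n-2
    """
    n = len(up) + 1
    P = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        P[i][i + 1] = up[i]
        P[i + 1][i] = down[i]
    for i in range(n):
        # The diagonal is still zero here, so this assigns the leftover mass.
        P[i][i] = 1.0 - sum(P[i])
    return P


def closed_form_invariant(up, down):
    """Product-form invariant pmf: pi[i+1] / pi[i] = up[i] / down[i]."""
    weights = [1.0]
    for u, d in zip(up, down):
        weights.append(weights[-1] * u / d)
    total = sum(weights)
    return [w / total for w in weights]


def satisfies_detailed_balance(P, pi, tol=1e-12):
    """Check pi[i] * P[i][j] == pi[j] * P[j][i] for every pair of states."""
    n = len(P)
    return all(
        abs(pi[i] * P[i][j] - pi[j] * P[j][i]) <= tol
        for i in range(n)
        for j in range(n)
    )


# Hypothetical transition probabilities for a four-state chain.
up = [0.3, 0.2, 0.1]
down = [0.4, 0.5, 0.6]
P = birth_death_chain(up, down)
pi = closed_form_invariant(up, down)
print(satisfies_detailed_balance(P, pi))
```

No linear system is solved here: the invariant distribution follows directly from the ratios of transition probabilities, which is the kind of simplification that reversibility makes available.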
