Abstract
This technical note deals with the mean-variance problem (known as the average variance (AV) minimization problem) for finite continuous-time Markov decision processes. We first introduce a so-called G-condition, which is weaker than the well-known ergodicity and unichain conditions and is sufficient for the finiteness of the AV of a policy. We also present an example of a policy that has infinite AV when the G-condition is not satisfied. Under the G-condition, we prove that the AV criterion can be transformed into an equivalent mean (or expected) average criterion by using a martingale technique and an observation from the canonical form of a transition rate matrix. Thus the existence and calculation of an AV minimal policy over a class of mean optimal policies are obtained by a policy iteration algorithm in a finite number of iterations. As a byproduct, we obtain some interesting new results about mean average optimality.