Abstract
We study discrete-time Markov decision processes with denumerable state space and bounded costs per stage. It is assumed that the decision maker exhibits a constant sensitivity to risk, and that the performance of a control policy is measured by a (long-run) risk-sensitive average cost criterion. Besides standard continuity-compactness conditions, the basic structural constraint on the decision model is that the transition law satisfies a simultaneous Doeblin condition. Within this framework, the main objective is to study the existence of bounded solutions to the risk-sensitive average cost optimality equation. Our main result guarantees a bounded solution to the optimality equation only when the risk-sensitivity coefficient λ is small enough, and a detailed example shows that this conclusion cannot be extended to arbitrary values of λ. Our results contradict previous claims in the literature, but agree with recent results obtained via a direct probabilistic analysis. A key analytical tool developed in the paper is an appropriately defined operator with contractive properties, analogous to the dynamic programming operator in Bellman's equation, together with a family of (value) functions that admit an interpretation in terms of discounted stochastic games.
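For context, the criterion and optimality equation referenced above typically take the following form in the risk-sensitive MDP literature; this is a sketch in standard notation, not necessarily the paper's own, where c denotes the bounded cost per stage, p the transition law, g the optimal risk-sensitive average cost, and h a relative value function:

J(x,\pi) \;=\; \limsup_{n\to\infty} \frac{1}{\lambda n}\,
  \log E_x^{\pi}\!\Bigl[\exp\Bigl(\lambda \sum_{t=0}^{n-1} c(X_t, A_t)\Bigr)\Bigr],

e^{\lambda\,(g + h(x))} \;=\;
  \min_{a \in A(x)} \Bigl[\, e^{\lambda\, c(x,a)} \sum_{y} p(y \mid x, a)\, e^{\lambda\, h(y)} \Bigr],
  \qquad x \in S.

Under this standard formulation, a bounded pair (g, h) solving the second equation identifies the optimal average cost g and yields an optimal stationary policy via the minimizing actions; the existence of such a bounded solution is precisely what the paper establishes for small λ and refutes for arbitrary λ.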