Abstract
This work concerns Markov decision chains on a denumerable state space. The controller has a positive risk-sensitivity coefficient, and the performance of a control policy is measured by a risk-sensitive average cost criterion. Besides standard continuity-compactness conditions, it is assumed that the state process is communicating under every stationary policy and that the simultaneous Doeblin condition holds. In this context, it is shown that if the cost function is bounded below and the superior limit average index is finite at some state, then (i) the optimal superior and inferior limit average value functions coincide and are constant, (ii) the optimal average cost is characterized via an extended version of the Collatz-Wielandt formula from the theory of positive matrices, and (iii) an optimality inequality holds, from which a stationary optimal policy is obtained. Moreover, an explicit example shows that, even when the cost function is bounded, strict inequality may occur in the optimality relation.
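For context, the classical Collatz-Wielandt formula that the abstract's characterization extends can be stated as follows; this is the standard finite-dimensional version for an irreducible nonnegative matrix $A$ with spectral radius $\rho(A)$, not the paper's extended form:

```latex
\[
\rho(A)
  \;=\; \max_{x \ge 0,\; x \neq 0} \;\min_{i \,:\, x_i > 0} \frac{(Ax)_i}{x_i}
  \;=\; \min_{x > 0} \;\max_{i} \frac{(Ax)_i}{x_i}.
\]
```

In the risk-sensitive average cost setting, an analogous min-max characterization is obtained with the positive matrix replaced by the controlled transition kernel weighted by the exponential of the cost, which is what the abstract refers to as an extended version of this formula.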