Abstract
We consider a class of terminating Markov decision processes with an exponential risk-averse objective function and compact constraint sets. We assume the existence of an absorbing cost-free terminal state Ω, positive transition costs and continuity of the transition probability and cost functions. Without discounting future costs in the argument of the exponential utility function, we establish (i) the existence of a real-valued optimal cost function which can be achieved by a stationary policy and (ii) the convergence of value iteration and policy iteration to the unique solution of Bellman's equation. We illustrate the results with two computational examples.
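To make the setting concrete, below is a minimal hedged sketch of the value iteration the abstract refers to, on a toy three-state problem. The multiplicative Bellman recursion J(x) = min_u Σ_y p(x,y,u) e^{g(x,u,y)} J(y) with J(Ω) = 1 is the standard form for undiscounted exponential-cost problems; the transition probabilities, the unit costs scaled to 0.1, and the terminal index `T` are illustrative assumptions, not data from the paper.

```python
import numpy as np

# Toy terminating MDP: states 0 and 1 are transient, state 2 is the
# absorbing cost-free terminal state Ω. Two actions. All numbers are
# illustrative; costs are positive and small enough that the
# exponential-cost value iteration converges.
n, T = 3, 2
# P[a][x][y]: transition probabilities; G[a][x][y]: one-step costs
P = np.array([
    [[0.5, 0.2, 0.3], [0.1, 0.5, 0.4], [0.0, 0.0, 1.0]],  # action 0
    [[0.2, 0.3, 0.5], [0.3, 0.1, 0.6], [0.0, 0.0, 1.0]],  # action 1
])
G = 0.1 * np.ones((2, n, n))    # positive transition costs
G[:, T, :] = 0.0                # Ω is cost-free

def value_iteration(tol=1e-10, max_iter=10_000):
    """Iterate J(x) = min_a sum_y P(x,y,a) exp(G(x,a,y)) J(y), J(Ω) = 1.

    J(x) is the optimal expected exponential of total cost from x, so
    J >= 1 everywhere and J(Ω) = 1 exactly.
    """
    J = np.ones(n)
    for _ in range(max_iter):
        # Q[a, x] = sum_y P(x, y, a) * exp(G(x, a, y)) * J(y)
        Q = np.einsum('axy,axy,y->ax', P, np.exp(G), J)
        J_new = Q.min(axis=0)
        J_new[T] = 1.0          # pin the terminal value
        if np.max(np.abs(J_new - J)) < tol:
            return J_new
        J = J_new
    return J

J = value_iteration()
# Greedy (stationary) policy extracted from the converged values
policy = np.argmin(np.einsum('axy,axy,y->ax', P, np.exp(G), J), axis=0)
```

Note that convergence here relies on the costs being small relative to the termination probabilities; with larger costs the exponential-cost iterates can diverge, which is why the paper's existence and convergence results require their stated assumptions.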