Abstract

We study the minimization of a spectral risk measure of the total discounted cost generated by a Markov Decision Process (MDP) over a finite or infinite planning horizon. The MDP is assumed to have Borel state and action spaces, and the cost function may be unbounded above. The optimization problem is split into two minimization problems using an infimum representation for spectral risk measures. We show that the inner minimization problem can be solved as an ordinary MDP on an extended state space and give sufficient conditions under which an optimal policy exists. Regarding the infinite-dimensional outer minimization problem, we prove the existence of a solution and derive an algorithm for its numerical approximation. Our results include the findings in Bäuerle and Ott (Math Methods Oper Res 74(3):361–379, 2011) in the special case that the risk measure is Expected Shortfall. As an application, we present a dynamic extension of the classical static optimal reinsurance problem, where an insurance company minimizes its cost of capital.



Summary

Introduction

There have been various proposals to replace the expectation in the optimization of Markov Decision Processes (MDPs) by risk measures. The recursive approach for general MDPs can be found, for example, in Ruszczynski (2010), Chu and Zhang (2014), and Bäuerle and Glauner (2021). The theory for this kind of model differs considerably from the theory for models in which the risk measure is applied to the total cost, since the recursive approach still yields a recursive solution procedure directly. When a spectral risk measure is applied to the total cost, the optimization problem splits into an inner and an outer minimization problem. The inner problem is to minimize the expectation of a convex function of the total cost. It can be solved with MDP techniques after a suitable extension of the original state space. We then treat the outer optimization problem and establish the existence of an optimal function in the representation of the spectral risk measure. All proofs and detailed derivations of our results are deferred to the appendix.
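The split into an inner and an outer problem can be illustrated in the Expected Shortfall special case mentioned in the abstract. The sketch below is not taken from the paper: it uses the standard infimum representation ES_alpha(X) = inf over q of q + E[(X - q)^+] / (1 - alpha), applied to a plain sample of costs rather than to an MDP. The function names, the sample, and the restriction to levels alpha for which n * (1 - alpha) is an integer are assumptions made for illustration. For a fixed q the objective is the expectation of a convex function of the cost (the analogue of the inner problem), and the minimization over q plays the role of the outer problem, which becomes a minimization over functions for a general spectral risk measure.

```python
import numpy as np


def expected_shortfall_tail_average(costs, alpha):
    """Expected Shortfall as the average of the worst (1 - alpha) share of the costs.

    Illustrative helper; assumes n * (1 - alpha) is a positive integer.
    """
    sorted_costs = np.sort(np.asarray(costs, dtype=float))
    n = sorted_costs.size
    k = int(round(n * (1.0 - alpha)))  # number of samples in the upper tail
    return sorted_costs[n - k:].mean()


def expected_shortfall_infimum(costs, alpha):
    """Expected Shortfall as the infimum over q of q + E[(X - q)^+] / (1 - alpha).

    For an empirical distribution the objective is convex and piecewise linear
    in q with kinks at the sample points, so it suffices to search over the
    samples. This brute-force search costs O(n^2) and is meant as a sketch only.
    """
    costs = np.asarray(costs, dtype=float)
    objective = [
        q + np.maximum(costs - q, 0.0).mean() / (1.0 - alpha)  # convex in the cost for fixed q
        for q in costs
    ]
    return min(objective)


# Hypothetical sample of total costs, chosen so that n * (1 - alpha) is an integer.
sample_costs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
level = 0.8

print(expected_shortfall_tail_average(sample_costs, level))
print(expected_shortfall_infimum(sample_costs, level))
```

Both functions return the same value up to floating-point error on this sample, which reflects that the infimum over q is attained at a sample quantile.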

Spectral risk measures
Markov decision model
Inner problem
Solution of the extended MDP
Outer problem: existence and numerical approximation
Infinite planning horizon
Relaxed assumptions for monotone models
Dynamic optimal reinsurance
