Abstract

This paper studies the average value-at-risk (AVaR) criterion for finite-horizon semi-Markov decision processes (SMDPs) in continuous time. Via an alternative representation of AVaR, we reduce the problem of minimizing the AVaR of the finite-horizon cost to two subproblems. The first is to minimize, over policies, the expected positive deviation of the finite-horizon cost from a given level, which is itself a new and interesting problem in the finite-horizon SMDP setting. The second is an ordinary problem of minimizing a function of a single variable. For the first subproblem, by extending the state space to include the cost level, we prove that the value function is a minimum solution to the optimality equation and that an optimal policy exists under suitable conditions. Furthermore, we show that the value function is the unique solution in a metric space to the optimality equation when one more condition is imposed, which plays a key role in the algorithm complexity analysis and the policy improvemen...
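The two-subproblem reduction can be illustrated with a minimal sketch. It assumes the alternative representation is the standard variational form AVaR_alpha(X) = min over s of { s + E[(X - s)^+] / (1 - alpha) }, which the abstract does not state explicitly, and it replaces the SMDP policy space with a small finite set of hypothetical policies, each described only by a sample of its finite-horizon cost. The names `expected_positive_deviation`, `min_avar`, and `policy_costs` are illustrative and do not come from the paper.

```python
import numpy as np


def expected_positive_deviation(costs, level):
    """Sample estimate of E[(X - level)^+] for one policy's cost sample."""
    return np.maximum(costs - level, 0.0).mean()


def min_avar(policy_costs, alpha):
    """Minimize AVaR over a finite set of hypothetical policies.

    policy_costs maps a policy name to a 1-D array of sampled costs.
    Returns (minimal AVaR, minimizing level s, minimizing policy name).
    """
    # For each policy the objective s + E[(X - s)^+] / (1 - alpha) is convex
    # and piecewise linear in s, with kinks at that policy's sample points,
    # so the pooled sample points are a sufficient grid of candidate levels.
    levels = np.unique(np.concatenate(list(policy_costs.values())))
    best = None
    for s in levels:
        # Subproblem 1: minimize the expected positive deviation over policies
        # for the fixed cost level s.
        name, dev = min(
            ((n, expected_positive_deviation(c, s)) for n, c in policy_costs.items()),
            key=lambda pair: pair[1],
        )
        # Subproblem 2: minimize a function of the single variable s.
        value = s + dev / (1.0 - alpha)
        if best is None or value < best[0]:
            best = (float(value), float(s), name)
    return best


if __name__ == "__main__":
    sample = {
        "a": np.arange(1, 11, dtype=float),  # costs 1, 2, ..., 10
        "b": np.full(10, 6.0),               # constant cost 6
    }
    print(min_avar(sample, alpha=0.8))
```

In the paper's setting the inner minimization is over SMDP policies and is handled through an optimality equation on the extended state space; the brute-force loop above only stands in for that step.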
