Abstract

We consider continuous-time stochastic optimal control problems featuring conditional value-at-risk (CVaR) in the objective. The major difficulty in these problems arises from time inconsistency, which prevents us from directly using dynamic programming. To resolve this challenge, we convert the problem into an equivalent bilevel optimization problem in which the inner optimization problem is a standard stochastic control problem. Furthermore, we provide conditions under which the outer objective function is convex and differentiable. We compute the outer objective's value via a Hamilton–Jacobi–Bellman equation and its gradient via the viscosity solution of a linear parabolic equation, which allows us to perform gradient descent. The significance of this result is that it yields an efficient dynamic-programming-based algorithm for optimal control of CVaR without lifting the state space. To broaden the applicability of the algorithm, we propose convergent approximation schemes for cases where our key assumptions do not hold and characterize the relevant suboptimality bounds. In addition, we extend our method to a more general class of risk metrics, which includes mean-variance and median-deviation. We also demonstrate a concrete application to portfolio optimization under CVaR constraints. Our results contribute an efficient framework for solving time-inconsistent CVaR-based sequential optimization problems.
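The bilevel structure can be illustrated on a static sample of losses. The abstract does not spell out the reformulation, so the sketch below assumes the standard Rockafellar-Uryasev representation, CVaR_alpha(L) = min_y { y + E[(L - y)^+] / (1 - alpha) }, in which the outer variable is the scalar y. There is no control or dynamics here: in the control setting, the expectation for fixed y would be the value of the inner stochastic control problem. All function names are hypothetical.

```python
def cvar_tail_average(losses, alpha):
    """Reference value: mean of the worst (1 - alpha) fraction of losses.

    Assumes (1 - alpha) * len(losses) is (numerically) a positive integer.
    """
    ordered = sorted(losses)
    k = round((1 - alpha) * len(ordered))  # number of tail samples
    return sum(ordered[-k:]) / k


def ru_objective(losses, alpha, y):
    """Outer objective for a fixed threshold y: y + E[(L - y)^+] / (1 - alpha)."""
    excess = sum(max(loss - y, 0.0) for loss in losses) / len(losses)
    return y + excess / (1 - alpha)


def cvar_bilevel(losses, alpha):
    """Minimize the outer objective over y.

    For an empirical distribution the objective is convex and piecewise
    linear in y with kinks at the samples, so it suffices to search the
    sample points (a brute-force stand-in for gradient descent).
    """
    return min(ru_objective(losses, alpha, y) for y in losses)
```

On the sample `[1.0, 2.0, 3.0, 4.0]` with `alpha = 0.5`, both `cvar_tail_average` and `cvar_bilevel` return 3.5, the mean of the two largest losses.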
