Abstract

Risk-averse decision-making is crucial in many revenue management applications. In dynamic settings, however, it is challenging to balance maximizing expected rewards against avoiding poor performance. In this paper, we consider time-consistent mean-semivariance (MSV) optimization for dynamic pricing problems within a discrete MDP framework. These problems are shown to be NP-hard. We present a novel fixpoint-based dynamic programming approach to compute risk-sensitive feedback policies with Pareto-optimal combinations of mean and semivariance. We illustrate the effectiveness and applicability of our concepts by comparing them with state-of-the-art heuristics. Across various numerical examples, our approach clearly outperforms all other heuristics and attains a performance guarantee with an optimality gap of less than 0.2%. Our approach is general and can be applied to MDPs beyond dynamic pricing.
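To make the mean-semivariance criterion concrete, the following minimal sketch computes the mean and the lower semivariance of a sample of total rewards and combines them into one score. It is an illustration only, not the fixpoint-based method of the paper: the function names, the weighting parameter `alpha`, and the weighted form "mean minus `alpha` times semivariance" are assumptions made for this example.

```python
from statistics import fmean


def mean_semivariance(rewards):
    """Return (mean, lower semivariance) of a sample of total rewards."""
    mu = fmean(rewards)
    # Only outcomes below the mean count as downside risk;
    # outcomes at or above the mean contribute zero.
    downside = [min(r - mu, 0.0) ** 2 for r in rewards]
    return mu, fmean(downside)


def msv_objective(rewards, alpha):
    """Hypothetical weighted score: mean minus alpha times semivariance."""
    mu, semivar = mean_semivariance(rewards)
    return mu - alpha * semivar


if __name__ == "__main__":
    # Two made-up reward samples with the same mean but different downside.
    steady = [18.0, 20.0, 22.0]
    risky = [10.0, 20.0, 30.0]
    for name, sample in (("steady", steady), ("risky", risky)):
        print(name, mean_semivariance(sample), msv_objective(sample, alpha=0.1))
```

Unlike the variance, the semivariance ignores deviations above the mean, so a policy is not penalized for unusually good outcomes. Sweeping `alpha` over a range of values is one simple way to trace out different trade-offs between mean and downside risk in such a sketch.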
