The curse of dimensionality severely restricts the use of dynamic programming methods in solving complex problems. Consequently, researchers and practitioners often resort to approximate (suboptimal) control policies that strike a balance between ease of implementation and satisfactory performance. Information relaxation-based duality techniques generate both upper and lower bounds on the true values of stochastic dynamic programming (SDP) problems, allowing the optimality of an approximate policy to be assessed through the dual gap between the two bounds. However, the literature still lacks guidance on handling cases where these gaps are excessively loose. In “Information Relaxation and a Duality-Driven Algorithm for Stochastic Dynamic Programs,” Chen, Ma, Liu, and Yu develop a novel duality-driven dynamic programming (DDP) framework that obtains and tightens confidence interval estimates for the true value functions of SDP problems. Leveraging a new finding that the dual operation yields subsolutions, they establish convergence guarantees for DDP. They also introduce a regression-based Monte Carlo method aimed at high-dimensional applications. Numerical examples demonstrate that DDP effectively tightens the dual gaps of various heuristics commonly used in the literature.
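To make the dual-gap idea concrete, the following is a minimal sketch (not the paper's algorithm) of information relaxation bounds on a toy optimal stopping problem: a heuristic policy gives a lower bound on the optimal value, while a perfect-information relaxation with zero penalty — taking the inner maximum along each realized path — gives a weak-duality upper bound. The problem setup, the threshold policy, and all parameters here are hypothetical and chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stopping problem: stop at any t in {0, ..., T} and collect payoff X_t,
# where X_t is a Gaussian random walk. (Illustrative setup, not from the paper.)
T, n_paths = 5, 10_000
X = rng.normal(loc=1.0, scale=1.0, size=(n_paths, T + 1)).cumsum(axis=1)

def heuristic_payoff(path, threshold=2.0):
    """Simple heuristic policy: stop the first time the payoff reaches the threshold."""
    for x in path:
        if x >= threshold:
            return x
    return path[-1]  # forced to stop at the horizon

# Lower bound: the expected payoff of any feasible (non-anticipative) policy.
lower = np.mean([heuristic_payoff(p) for p in X])

# Upper bound: perfect-information relaxation with zero penalty.
# Maximizing along each realized path over-uses future information,
# so its average dominates the optimal value (weak duality).
upper = np.mean(X.max(axis=1))

gap = upper - lower  # dual gap: bounds the heuristic's suboptimality
print(f"lower={lower:.3f}  upper={upper:.3f}  gap={gap:.3f}")
```

When the gap is small, the heuristic is provably near-optimal; when it is loose, either the penalty in the upper bound or the policy in the lower bound needs improvement, which is the situation the DDP framework is designed to address.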