An Asymptotically Efficient Simulation-Based Algorithm for Finite Horizon Stochastic Dynamic Programming

Hyeong Soo Chang, Jiaqiao Hu, Michael C. Fu, Steven I. Marcus

doi:10.1109/tac.2006.887917

Open Access

https://doi.org/10.1109/tac.2006.887917

Journal: IEEE Transactions on Automatic Control
Publication Date: Jan 1, 2007
Citations: 34

Affiliation: Sogang University, State University of New York, Stony Brook University, University of Maryland, College Park

#Candidate Policies #Multiplicative Weight

Abstract

We present a simulation-based algorithm called Simulated Annealing Multiplicative Weights (SAMW) for solving large finite-horizon stochastic dynamic programming problems. At each iteration of the algorithm, a probability distribution over candidate policies is updated by a simple multiplicative weight rule, and with proper annealing of a control parameter, the generated sequence of distributions converges to a distribution concentrated only on the best policies. The algorithm is asymptotically efficient, in the sense that for the goal of estimating the value of an optimal policy, a provably convergent finite-time upper bound for the sample mean is obtained.
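
To make the update described in the abstract concrete, the following is a minimal Python sketch of a multiplicative weight rule with an annealed control parameter. It is an illustration under stated assumptions, not the authors' implementation: the function names (`samw_sketch`, `toy_simulator`), the annealing schedule `beta = 1 + 1/sqrt(i)`, the normalization of sampled values to [0, 1], and the toy three-policy simulator are all hypothetical choices made for this example. The abstract itself does not specify the schedule or the exact form of the update.

```python
import math
import random


def toy_simulator(policy_index, rng):
    """Hypothetical stand-in for simulating one sample path under a policy.

    Returns a noisy value in [0, 1]; policy 2 is best by construction.
    """
    means = [0.2, 0.5, 0.8]  # assumed true policy values (illustration only)
    return means[policy_index] + rng.uniform(-0.1, 0.1)


def samw_sketch(simulate, num_policies, num_iters, seed=0):
    """Multiplicative weight update over candidate policies with annealing.

    simulate(policy_index, rng) must return a sampled value in [0, 1].
    Returns the final probability distribution over policies.
    """
    rng = random.Random(seed)
    # Start from the uniform distribution over candidate policies.
    weights = [1.0 / num_policies] * num_policies
    for i in range(1, num_iters + 1):
        # Assumed annealing schedule: the control parameter decreases to 1.
        beta = 1.0 + 1.0 / math.sqrt(i)
        # One simulated value per candidate policy at this iteration.
        values = [simulate(p, rng) for p in range(num_policies)]
        # Multiplicative rule: scale each weight by beta ** value.
        unnormalized = [w * beta ** v for w, v in zip(weights, values)]
        total = sum(unnormalized)
        weights = [u / total for u in unnormalized]
    return weights


if __name__ == "__main__":
    final = samw_sketch(toy_simulator, num_policies=3, num_iters=2000)
    print(final)
```

In this toy setting the distribution drifts toward the policy with the highest mean sampled value, which mirrors the concentration behavior the abstract describes. Whether a given schedule satisfies the "proper annealing" conditions of the paper is not something this sketch establishes.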