We consider a continuous-valued simulation optimization (SO) problem, where a simulator is built to optimize an expected performance measure of a real-world system while parameters of the simulator are estimated from streaming data collected periodically from the system. At each period, a new batch of data is combined with the cumulative data and the parameters are re-estimated with higher precision. The system requires the decision variable to be selected in all periods. Therefore, it is sensible for the decision-maker to update the decision variable at each period by solving a more precise SO problem with the updated parameter estimate to reduce the performance loss with respect to the target system. We define this decision-making process as the multi-period SO problem and introduce a multi-period stochastic approximation (SA) framework that generates a sequence of solutions. Two algorithms are proposed: Re-start SA ( ReSA ) reinitializes the stepsize sequence in each period, whereas Warm-start SA ( WaSA ) carefully tunes the stepsizes, taking both fewer and shorter gradient-descent steps in later periods as parameter estimates become increasingly more precise. We show that under suitable strong convexity and regularity conditions, ReSA and WaSA achieve the best possible convergence rate in expected sub-optimality either when an unbiased or a simultaneous perturbation gradient estimator is employed, while WaSA accrues significantly lower computational cost as the number of periods increases. In addition, we present the regularized ReSA which obviates the need to know the strong convexity constant and achieves the same convergence rate at the expense of additional computation.
Read full abstract