Abstract

This paper studies the continuity of stochastic control problems with respect to the initial probability measure (the prior). The continuity results are used to study the robustness of optimal control policies applied to systems with incorrect prior models. It is shown that, for multi-stage optimal cost problems, weak or setwise convergence of the priors is in general not sufficient for continuity and robustness, but that under mild conditions the optimal cost is continuous in the prior under convergence in total variation. We also give sufficient conditions for continuity of the optimal cost under weak convergence of the priors. Using these continuity results, we bound the mismatch error incurred by applying a control policy designed for an incorrectly estimated prior model, in terms of a distance between the true model and the incorrect one. We present implications of these results for empirical learning in control, where i.i.d. empirical measures converge weakly almost surely but in general do not converge in stronger senses such as total variation. These results are practically important for empirical learning in stochastic control, since in engineering applications system models are often learned from training data.
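
The gap between weak and total variation convergence of empirical measures can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it assumes a Uniform(0,1) prior, uses the Kolmogorov (sup-CDF) distance as a stand-in for weak convergence (valid here because the limit law is continuous), and takes total variation as the supremum over sets; under the norm convention the value doubles. The function names are hypothetical.

```python
import numpy as np


def kolmogorov_distance_to_uniform(samples):
    """Sup-norm gap between the empirical CDF of `samples` and the Uniform(0,1) CDF."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    upper = np.arange(1, n + 1) / n  # empirical CDF value at each sorted sample
    lower = np.arange(0, n) / n      # empirical CDF value just before each sample
    # The Uniform(0,1) CDF is F(x) = x, so the gap is attained at a sample point.
    return float(max(np.max(upper - x), np.max(x - lower)))


def tv_gap_on_sample_set(samples):
    """Mass gap on the finite set of sample points, a lower bound on total variation."""
    empirical_mass = 1.0  # the empirical measure sits entirely on its samples
    true_mass = 0.0       # assumption: the true law is continuous, so finite sets have mass 0
    return empirical_mass - true_mass


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for n in (100, 10_000, 100_000):
        s = rng.uniform(0.0, 1.0, size=n)
        print(n, kolmogorov_distance_to_uniform(s), tv_gap_on_sample_set(s))
```

Under these assumptions the Kolmogorov distance shrinks as the sample size grows, while the total variation gap stays at its maximal value for every sample size. This is why continuity results stated only for total variation do not apply directly to priors learned from data.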
