We consider control of uncertain linear time-varying stochastic systems from the perspective of regret minimization. Specifically, we focus on the problem of designing a feedback controller that minimizes the loss relative to a clairvoyant optimal policy that has foreknowledge of both the system dynamics and the exogenous disturbances. In this competitive framework, establishing robustness guarantees proves challenging because, unlike the case where the model is known, the clairvoyant optimal policy is not only inapplicable but also impossible to compute without knowledge of the system parameters. To address this challenge, we adopt a scenario optimization approach and propose minimizing regret robustly over a finite set of randomly sampled system parameters. We prove that this policy optimization problem can be solved through semidefinite programming, and that the corresponding solution retains strong probabilistic out-of-sample regret guarantees in the face of the uncertain dynamics. Our method naturally extends to include satisfaction of safety constraints with high probability. We validate our theoretical results and showcase the potential of our approach by means of numerical simulations.