In this article, we propose a novel learning-based model predictive control (MPC) framework for nonlinear systems that guarantees closed-loop learning of the controlled system. We consider a cost function that combines a general economic cost with a user-defined learning cost designed to incentivize learning of the unknown system. Because of the finite horizon of the MPC scheme and the presence of disturbances, the open-loop trajectory usually differs from the closed-loop one. As a result of this mismatch, existing learning-based MPC schemes exhibit a learning phase only in the open-loop prediction and provide no formal guarantee on the actual closed-loop learning. We show how existing MPC schemes can be readily modified to guarantee closed-loop learning of the system by including a suitable discount factor in the chosen learning cost function and adding a constraint to the original MPC scheme. We also show that various techniques for learning the system dynamics online, such as kinky inference methods, Gaussian processes, or parametric approaches, can be used within the proposed general framework.