Abstract

The closed-loop stability of an optimal policy provided by an Economic Nonlinear Model Predictive Control (ENMPC) scheme requires the existence of a storage function satisfying dissipativity conditions. Unfortunately, finding such a storage function is difficult in general. In contrast, a tracking NMPC scheme uses a stage cost that is lower-bounded by a class-K∞ function, so its closed-loop stability is fairly straightforward to establish. Under the dissipativity conditions, an ENMPC scheme admits an equivalent tracking MPC scheme that delivers the same optimal policy. In this paper, we exploit this idea and parameterize the stage cost and terminal cost of a tracking MPC together with an additional parameterized storage function. We show that, if the parameterization of the tracking MPC scheme is rich enough to capture the exact optimal action-value function of the ENMPC scheme, then the parameterized storage function at the optimal parameters satisfies the dissipativity conditions for both discounted and undiscounted ENMPC schemes; in particular, these conditions are met for dissipative problems. We propose Q-learning as a practical way of adjusting the parameters of the tracking MPC. Several numerical examples illustrate the efficiency of the proposed method, including LQR, non-dissipative, non-polynomial, and nonlinear chemical case studies. For instance, in the non-polynomial case study, the learning method improves the storage-function estimate by about 60% and 99.5% after 10 and 50 learning steps, respectively, compared with the Sum-of-Squares method.
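To make the learning mechanism concrete, the following is a minimal sketch of a temporal-difference Q-learning update for the parameters θ of an action-value function Q_θ(s, a). This is not the paper's implementation: in the paper, Q_θ is defined implicitly by the parameterized tracking MPC (stage cost, terminal cost, and storage function), whereas here a simple quadratic surrogate, a hypothetical behavior policy, and assumed linear dynamics stand in for illustration.

```python
import numpy as np

def q_value(theta, s, a):
    # Assumed surrogate form: Q_theta(s, a) = theta[0]*s^2 + theta[1]*a^2
    return theta[0] * s**2 + theta[1] * a**2

def q_gradient(theta, s, a):
    # Gradient of Q_theta with respect to theta (linear in the parameters)
    return np.array([s**2, a**2])

def q_learning_step(theta, s, a, stage_cost, s_next, gamma=0.95, alpha=0.01):
    # For this cost-minimizing quadratic surrogate (theta >= 0),
    # the greedy next action is a = 0, so the TD target simplifies.
    target = stage_cost + gamma * q_value(theta, s_next, 0.0)
    td_error = target - q_value(theta, s, a)
    return theta + alpha * td_error * q_gradient(theta, s, a)

theta = np.array([0.0, 0.0])
rng = np.random.default_rng(0)
for _ in range(500):
    s = rng.uniform(-1.0, 1.0)
    a = -0.5 * s                   # hypothetical fixed behavior policy
    s_next = 0.9 * s + a           # assumed linear dynamics
    cost = s**2 + 0.1 * a**2       # assumed quadratic stage cost
    theta = q_learning_step(theta, s, a, cost, s_next)
print(theta)
```

In the paper's setting, the same TD update is applied to the MPC parameterization itself, so that the learned parameters simultaneously shape the tracking stage/terminal costs and the candidate storage function.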
