Composing MPC With LQR and Neural Network for Amortized Efficiency and Stable Control

Fangyu Wu, Guanhua Wang, Alexandre Bayen, Alexander Keimer, Siyuan Zhuang, Kehan Wang, Ion Stoica

doi:10.1109/tase.2023.3259428

Abstract

Model predictive control (MPC) is a powerful control method that handles dynamical systems with constraints. However, solving MPC iteratively in real time, i.e., implicit MPC, remains a computational challenge. To address this, common solutions include explicit MPC and function approximation. Both methods, whenever applicable, may improve the computational efficiency of implicit MPC by several orders of magnitude. Nevertheless, explicit MPC often requires expensive pre-computation and does not easily apply to higher-dimensional problems. Meanwhile, function approximation, although it scales better with dimension, still requires pre-training on a large dataset and generally cannot guarantee an accurate surrogate policy, the failure of which often leads to closed-loop instability. To address these issues, we propose a triple-mode hybrid control scheme, named Memory-Augmented MPC, that combines a linear quadratic regulator, a neural network, and an MPC. From its standard form, we derive two variants of this hybrid control scheme: one customized for chaotic systems and the other for slow systems. The proposed scheme does not require pre-computation and is capable of improving the amortized running time of the composed MPC with a well-trained neural network. In addition, the scheme maintains closed-loop stability with any neural network of proper input and output dimensions, alleviating the need to certify the optimality of the neural network in safety-critical applications.
Note to Practitioners: This article was motivated by the need to reduce the amortized cost of MPC in repetitive industrial robotic applications, where long-term operational cost is important and safety is critical. Examples of such applications include factory robotic arm manipulation and fixed-route quadcopter payload transport. Unlike explicit MPC or function approximation, our approach does not require any pre-computation or pre-training. Rather, it attains task proficiency over time by learning a surrogate neural network on the spot and by gradually replacing the costly MPC with the more efficient surrogate model so long as safety permits. Consequently, the proposed scheme incurs a learning cost during the initial phase of deployment but usually becomes more adept at the task afterwards, leading to amortized efficiency.
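
The triple-mode idea described above can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not state the switching rule, so the acceptance test, the LQR region, the scalar plant, the gain, and every name below (`hybrid_step`, `is_acceptable`, `fallback_mpc`, `rollout`) are assumptions made purely for illustration. The "MPC" is a saturated linear law standing in for a real constrained solve, and the surrogate is any callable, with training on the stored samples omitted.

```python
from typing import Callable, List, Tuple

# Toy scalar plant x+ = A*x + B*u; all constants are assumed for illustration.
A, B = 1.2, 1.0      # open-loop unstable dynamics
K = 0.7              # hypothetical stabilizing gain standing in for the LQR gain
U_MAX = 1.0          # input bound
LQR_RADIUS = 1.0     # region where -K*x is guaranteed to respect the bound


def fallback_mpc(x: float) -> float:
    """Placeholder for the costly MPC solve: a saturated linear law."""
    return max(-U_MAX, min(U_MAX, -K * x))


def is_acceptable(x: float, u: float) -> bool:
    """Assumed acceptance test: input within bounds and |x| strictly shrinks."""
    return abs(u) <= U_MAX and abs(A * x + B * u) < abs(x)


def hybrid_step(
    x: float,
    surrogate: Callable[[float], float],
    memory: List[Tuple[float, float]],
) -> Tuple[float, str]:
    """Pick a control and report which of the three modes produced it."""
    if abs(x) <= LQR_RADIUS:          # mode 1: cheap linear feedback near the origin
        return -K * x, "lqr"
    u_nn = surrogate(x)               # mode 2: try the learned surrogate
    if is_acceptable(x, u_nn):
        return u_nn, "nn"
    u_mpc = fallback_mpc(x)           # mode 3: fall back to MPC
    memory.append((x, u_mpc))         # store the sample for later surrogate training
    return u_mpc, "mpc"


def rollout(x0: float, surrogate: Callable[[float], float], steps: int = 30):
    """Simulate the closed loop and record the mode used at each step."""
    memory: List[Tuple[float, float]] = []
    modes: List[str] = []
    x = x0
    for _ in range(steps):
        u, mode = hybrid_step(x, surrogate, memory)
        modes.append(mode)
        x = A * x + B * u
    return x, modes, memory
```

In this sketch a surrogate that always outputs an out-of-bounds input, such as `lambda x: 10.0`, is rejected at every step, so the loop falls back to the placeholder MPC until the state enters the LQR region. A surrogate that passes the test is used directly, and no MPC solve is recorded. This mirrors, in a very simplified form, the claim that stability should not depend on the quality of the neural network.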

Full Text