ABSTRACT
Model predictive control (MPC) is a well-developed method capable of handling complex control tasks. Implementing an MPC requires the solution of a deterministic finite horizon nonlinear optimal control (OC) problem. OC problems can be solved globally, explicitly expressing the optimal policy (and value function) as a function of the present state, or locally, using online trajectory optimization, which generates a solution valid only for the present state. For nonlinear problems, however, neither is possible analytically, and the latter, finding a local solution online, is usually preferred. The difficulty with online trajectory optimization is that the solution must be available within a single sampling period, and the main parameter affecting the computational demand of MPC is the length of the prediction horizon. The goal of this work is to shorten the prediction horizon of the MPC, reducing computation time while preserving optimality guarantees. To this end, we propose approximating the trajectory optimization problem for MPC by learning a finite horizon value function. The approximate value function is inserted as the terminal cost of a truncated trajectory optimization problem, so that the MPC can be realized with a reduced prediction horizon and hence a reduced computational load. By sampling the value function in a goal-oriented way, we show that an effective approximation can be obtained by fitting both the function values and the gradients of the value function. The result is an accurate approximate MPC that leverages learning methodologies to reduce computational cost while still accounting for constraints.
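As an illustrative sketch of the idea described above (the symbols below, including the stage cost l, dynamics f, shortened horizon N_s, constraint set Z, and learned terminal value function V_theta, are our own notation and not taken from the abstract), the truncated problem replaces the tail of the horizon with a learned terminal value:

$$
\min_{u_0,\dots,u_{N_s-1}} \;\sum_{k=0}^{N_s-1} \ell(x_k,u_k) \;+\; \hat{V}_\theta(x_{N_s})
\quad \text{s.t.}\quad x_{k+1}=f(x_k,u_k),\;\; x_0 = x(t),\;\; (x_k,u_k)\in\mathcal{Z},
$$

where the shortened horizon satisfies $N_s \ll N$. Under the same assumed notation, fitting both values and gradients at goal-oriented samples $x^{(i)}$ could take the form

$$
\min_\theta \;\sum_i \bigl\lVert \hat{V}_\theta(x^{(i)}) - V(x^{(i)}) \bigr\rVert^2
\;+\; \lambda \,\bigl\lVert \nabla_x \hat{V}_\theta(x^{(i)}) - \nabla_x V(x^{(i)}) \bigr\rVert^2 ,
$$

with $\lambda$ a weighting parameter; the exact training loss and sampling scheme are detailed in the full paper.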