AEROSPACE engineering applications greatly stimulated the development of optimal control theory during the 1950s and 1960s, where the objective was to drive the system states in such a way that some defined cost was minimized. This turned out to have very useful applications in the design of regulators (where some steady state is to be maintained) and in tracking control strategies (where some predetermined state trajectory is to be followed). Among such applications was the problem of optimal flight trajectories for aircraft and space vehicles. Linear optimal control theory in particular has been very well documented and widely applied, where the plant that is controlled is assumed linear and the feedback controller is constrained to be linear with respect to its input. However, the availability of powerful low-cost microprocessors has spurred great advances in the theory and applications of nonlinear control. The competitive era of rapid technological change, particularly in aerospace exploration, now demands stringent accuracy and cost requirements in nonlinear control systems. This has motivated the rapid development of nonlinear control theory for application to challenging, complex, dynamical real-world problems, particularly those that bear major practical significance in the aerospace, marine, and defense industries.

Infinite-time horizon nonlinear optimal control (ITHNOC) presents a viable option for synthesizing stabilizing controllers for nonlinear systems by making a state-input tradeoff, where the objective is to minimize the cost given by a performance index. The original theory of nonlinear optimal control dates from the 1960s, and various theoretical and practical aspects of the problem have been addressed in the literature over the decades since. In particular, the continuous-time nonlinear deterministic optimal control problem associated with autonomous (time-invariant) nonlinear regulator systems that are affine (linear) in the controls has been studied by many authors. The long-established theory of optimal control offers quite mature and well-documented techniques for solving this control-affine nonlinear optimization problem, based on dynamic programming or the calculus of variations, but their application is generally a very tedious task. Bellman's dynamic programming approach reduces to solving a nonlinear first-order partial differential equation (PDE), the Hamilton–Jacobi–Bellman (HJB) equation. The solution to the HJB equation gives the optimal performance/cost value (or storage) function and, under some smoothness assumptions, determines an optimal control in feedback form. Alternatively, in the classical calculus of variations, optimal control problems can be characterized locally in terms of the Hamiltonian dynamics arising from Pontryagin's minimum principle. These are the characteristic equations of the HJB PDE, which result in a nonlinear, constrained two-point boundary value problem (TPBVP) that, in general, can only be solved by successive approximation of the optimal control input using iterative numerical techniques for each set of initial conditions. Numerically, even though the nonlinear TPBVP is somewhat easier to solve than the HJB PDE, the control signals can only be determined offline and are thus best suited for feedforward control of plants for which the state trajectories are known a priori. Therefore, contrary to the dynamic programming approach, the resultant control law is not generally in feedback form.
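To fix ideas, the following is a minimal sketch of the control-affine infinite-horizon problem and the associated HJB equation described above; the symbols f, g, q, R, V, H, and \lambda are generic placeholders introduced here for illustration and are not taken from the original text:
\[
\dot{x} = f(x) + g(x)\,u, \qquad
J(x_0,u) = \int_0^{\infty} \big( q(x) + u^{\top} R(x)\, u \big)\, dt ,
\]
\[
\min_{u}\Big\{ q(x) + u^{\top} R(x)\, u + \frac{\partial V}{\partial x}\big( f(x) + g(x)\, u \big) \Big\} = 0 , \qquad V(0) = 0 ,
\]
whose pointwise minimization (assuming a differentiable value function V and R(x) > 0) gives the feedback law
\[
u^{*}(x) = -\tfrac{1}{2}\, R(x)^{-1} g(x)^{\top} \Big( \frac{\partial V}{\partial x}(x) \Big)^{\!\top} .
\]
Likewise, the Hamiltonian H(x,\lambda,u) = q(x) + u^{\top} R(x)\,u + \lambda^{\top}\big(f(x) + g(x)\,u\big) yields, via Pontryagin's minimum principle, the characteristic (state-costate) equations \dot{x} = \partial H/\partial \lambda^{\top} and \dot{\lambda} = -\,\partial H/\partial x^{\top}, with split boundary conditions on x(0) and \lambda; this is the two-point boundary value problem referred to above.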
Open-loop control, however, is sensitive to random disturbances and requires that the initial state be on the optimal trajectory. In contrast, nonlinear optimal feedback has inherent robustness properties (inherent in the sense that they are obtained even though the design ignores uncertainty and disturbances). The principal difficulty with the HJB approach is that no efficient algorithm is available to solve the PDE when it is nonlinear and the problem dimension is high, making it impossible to derive exact expressions for optimal controls for most nontrivial problems of interest. The optimal control can only be computed in special cases, such as linear dynamics with quadratic cost, or very low-dimensional systems. In particular, if the plant is linear time invariant (LTI) and the (infinite-time) performance index is quadratic, then the corresponding HJB equation for this famous linear-quadratic regulator (LQR) problem reduces to an algebraic Riccati equation (ARE). In contrast to the well-developed and widely applied theory and computational tools for the Riccati equation (for example, see [1]), the HJB equation is difficult, if not impossible, to solve for most practical applications; the exact solution for the optimal control policies is very complex.
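As a concrete illustration of the LQR special case just described, the following minimal numerical sketch solves the ARE with SciPy's standard Riccati solver; the double-integrator plant and the weighting matrices Q and R are illustrative choices and do not come from the original text:

# Illustrative LQR example: for an LTI plant and quadratic cost, the HJB
# equation reduces to the algebraic Riccati equation (ARE), which standard
# numerical tools solve directly.
import numpy as np
from scipy.linalg import solve_continuous_are

# Double-integrator plant x_dot = A x + B u (illustrative choice).
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])

# Quadratic performance index weights (illustrative choice).
Q = np.eye(2)
R = np.array([[1.0]])

# Solve A'P + P A - P B R^{-1} B' P + Q = 0 for P, so that V(x) = x' P x is
# the optimal value function and u = -K x is the optimal feedback.
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

print("P =\n", P)
print("K =", K)
# The closed-loop matrix A - B K should be Hurwitz (all eigenvalues in the
# open left half-plane), confirming that the optimal feedback stabilizes the plant.
print("closed-loop eigenvalues:", np.linalg.eigvals(A - B @ K))

No comparably general solver exists for the nonlinear HJB PDE itself, which is the difficulty emphasized above.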