Abstract

Control system theory has traditionally rested on well-understood and widely accepted techniques such as transfer-function-based methods, adaptive control, robust control, nonlinear systems theory, and state-space methods. In recent decades, many successful results have also been obtained by incorporating artificial neural networks into classical control structures. Because of their universal approximation property, neural networks are well suited to designing controllers for complex nonlinear systems. These results have led a number of control engineers to turn to the results and algorithms of the machine learning and computational intelligence community and, at the same time, to seek new inspiration in the biological neural structures of living organisms in their most evolved and complex form: the human brain. In this chapter we discuss two algorithms, based on a biologically inspired structure, that learn the optimal state feedback controller for a linear system while simultaneously performing continuous-time online control of that system. Moreover, because the algorithms are related to reinforcement learning techniques, in which an agent tries to maximize the total reward received while interacting with an unknown environment, the optimal controller is obtained using only the input-to-state system dynamics. Mathematically speaking, the solution of the algebraic Riccati equation underlying the optimal control problem is obtained without any knowledge of the system's internal dynamics. Both algorithms iterate between policy evaluation and policy update steps until updating the control policy no longer improves system performance. Both can be characterized as direct adaptive optimal control schemes, since the optimal control solution is determined without an explicit, a priori model of the system's internal dynamics. The effectiveness of the algorithms is demonstrated, and their performance compared, on the problem of finding the optimal state feedback controller for an F-16 autopilot.
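
The alternation between policy evaluation and policy update can be illustrated with a minimal sketch of the classical model-based policy iteration for the continuous-time linear quadratic problem (often attributed to Kleinman). This is not the chapter's method: the sketch uses the internal dynamics matrix `A` explicitly and runs offline, whereas the algorithms discussed here avoid that knowledge and learn online. The matrices `A`, `B`, `Q`, `R`, the initial gain `K0`, and the helper names are all illustrative assumptions, not the F-16 model.

```python
import numpy as np


def solve_lyapunov(Ac, Qc):
    """Solve Ac^T P + P Ac + Qc = 0 for P via Kronecker-product vectorization."""
    n = Ac.shape[0]
    I = np.eye(n)
    # With row-major vec: vec(Ac^T P) = kron(Ac^T, I) vec(P),
    #                     vec(P Ac)   = kron(I, Ac^T) vec(P).
    M = np.kron(Ac.T, I) + np.kron(I, Ac.T)
    p = np.linalg.solve(M, -Qc.reshape(-1))
    P = p.reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off


def policy_iteration(A, B, Q, R, K0, tol=1e-10, max_iter=50):
    """Iterate policy evaluation and policy update until the cost matrix P settles.

    K0 must be a stabilizing gain (A - B K0 has eigenvalues in the open left half-plane).
    """
    K = K0
    P_prev = None
    for _ in range(max_iter):
        Ac = A - B @ K
        # Policy evaluation: cost of the current policy u = -K x.
        P = solve_lyapunov(Ac, Q + K.T @ R @ K)
        # Policy update: K = R^{-1} B^T P.
        K = np.linalg.solve(R, B.T @ P)
        if P_prev is not None and np.linalg.norm(P - P_prev) < tol:
            break
        P_prev = P
    return P, K


# Hypothetical second-order system (stable, so K0 = 0 is a stabilizing start).
A = np.array([[0.0, 1.0], [-1.0, -2.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.zeros((1, 2))

P, K = policy_iteration(A, B, Q, R, K0)

# Residual of the algebraic Riccati equation A^T P + P A - P B R^{-1} B^T P + Q = 0.
residual = A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q
print("ARE residual norm:", np.linalg.norm(residual))
print("Feedback gain K:", K)
```

When the iteration has converged, `P` satisfies the algebraic Riccati equation to numerical precision, which is the stopping condition described above: a further policy update no longer changes the cost.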
