Abstract

In this paper, a novel output feedback solution based on the Q-learning algorithm using measured data is proposed for the linear quadratic tracking (LQT) problem of unknown discrete-time systems. To this end, an augmented system composed of the original controlled system and the linear command generator is first constructed. Then, using past input, output, and reference trajectory data of the augmented system, the output feedback Q-learning scheme learns the optimal tracking controller online without requiring any knowledge of the augmented system dynamics. Both policy iteration (PI) and value iteration (VI) learning algorithms are developed and shown to converge to the optimal solution. Finally, simulation results are provided to verify the effectiveness of the proposed scheme.
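The value-iteration route described in the abstract (repeatedly fitting a quadratic Q-function from measured transitions and improving the greedy controller) can be sketched as follows. This is a simplified illustration, not the paper's method: it assumes a hypothetical scalar plant, a constant command generator, and full access to the augmented state [x, r] rather than the paper's output-feedback parameterization built from past input/output data; all numerical parameters are illustrative assumptions.

```python
import numpy as np

# Hedged sketch of value-iteration (VI) Q-learning for an LQT problem.
# Assumptions (not from the paper): scalar plant x_{k+1} = a x_k + b u_k with
# y_k = x_k, constant command generator r_{k+1} = r_k, and direct access to
# the augmented state [x, r] instead of past input/output data.
a, b = 0.9, 0.5   # plant parameters: unknown to the learner, used only to simulate data
gamma = 0.95      # discount factor of the LQT cost
Ru = 0.1          # control weight; the tracking-error weight is 1

def cost(x, r, u):
    """Stage cost: squared tracking error plus control effort."""
    return (x - r) ** 2 + Ru * u ** 2

def phi(x, r, u):
    """Quadratic basis so that Q = s^T H s with s = [x, r, u]."""
    return np.array([x * x, r * r, u * u, 2 * x * r, 2 * x * u, 2 * r * u])

def H_from_w(w):
    """Rebuild the symmetric Q-function kernel H from the weight vector."""
    return np.array([[w[0], w[3], w[4]],
                     [w[3], w[1], w[5]],
                     [w[4], w[5], w[2]]])

def greedy_gain(H):
    """The minimizing action is linear: u = K @ [x, r], K = -H_uu^{-1} H_u,[x r]."""
    return -np.array([H[2, 0], H[2, 1]]) / H[2, 2]

rng = np.random.default_rng(0)
w = np.zeros(6)
w[2] = 1.0                      # positive u^2 weight so the greedy action is defined
for _ in range(80):             # VI sweeps
    K = greedy_gain(H_from_w(w))
    Phi, targets = [], []
    for _ in range(200):        # exploratory transitions; no model knowledge is used
        x, r = rng.normal(0, 2), rng.normal(0, 2)
        u = rng.normal(0, 2)    # persistently exciting input
        xn = a * x + b * u      # "measured" next output; the reference stays constant
        un = K @ np.array([xn, r])
        # VI target: stage cost + discounted greedy value at the next state
        targets.append(cost(x, r, u) + gamma * phi(xn, r, un) @ w)
        Phi.append(phi(x, r, u))
    w, *_ = np.linalg.lstsq(np.array(Phi), np.array(targets), rcond=None)

K = greedy_gain(H_from_w(w))    # learned tracking controller u_k = K @ [x_k, r_k]
```

Here the least-squares fit plays the role of one VI Bellman update; a PI variant would instead evaluate the current policy's Q-function to convergence before each greedy improvement, which is the trade-off between the two algorithms the abstract mentions.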
