Abstract

This paper investigates the output feedback (OPFB) tracking control problem for discrete-time linear (DTL) systems with unknown dynamics. Using an augmented-system approach, the tracking control problem is first converted into a regulation problem with a discounted performance function, whose solution relies on the Q-function-based Bellman equation. A novel value iteration (VI) scheme based on a reinforcement Q-learning mechanism is then proposed for solving the Q-function Bellman equation without knowledge of the system dynamics. Moreover, the convergence of the VI-based Q-learning is proved by showing that it converges to the solution of the Q-function Bellman equation and introduces no bias in the solution, even under probing noise satisfying the persistent excitation (PE) condition. As a result, the OPFB tracking controller can be learned online using past input, output, and reference-trajectory data of the augmented system. The proposed scheme removes the requirement of an initial admissible policy needed in the policy iteration (PI) method. Finally, the effectiveness of the proposed scheme is demonstrated through a simulation example.
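As a rough illustration of the VI-based Q-learning idea, the sketch below fits a quadratic Q-function by least squares at each value-iteration step. It simplifies the paper's setting in several assumed ways: full state feedback rather than OPFB, an undiscounted cost, and toy dynamics used only to generate transition data (the learner itself never uses the system matrices). It is a sketch under these assumptions, not the paper's algorithm.

```python
import numpy as np

# Minimal sketch of value-iteration (VI) Q-learning with a quadratic Q-function.
# Assumptions: full state feedback, undiscounted cost, illustrative toy dynamics
# used only to simulate transitions -- the learner never reads A or B directly.
rng = np.random.default_rng(0)
A = np.array([[0.8, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
Qc, Rc = np.eye(2), np.eye(1)          # state and input cost weights
n, m = 2, 1
p = n + m
iu = np.triu_indices(p)

def phi(z):
    # quadratic features so that phi(z) @ theta == z^T H z for symmetric H
    M = np.outer(z, z)
    M = 2.0 * M - np.diag(np.diag(M))  # double off-diagonals, keep diagonal
    return M[iu]

# collect transitions under a persistently exciting random probing input
X, U, Xn = [], [], []
x = rng.standard_normal(n)
for _ in range(300):
    u = rng.standard_normal(m)
    xn = A @ x + B @ u
    X.append(x); U.append(u); Xn.append(xn)
    x = xn

H = np.zeros((p, p))  # VI starts from H = 0: no admissible initial policy needed
for _ in range(60):
    Hxx, Hxu, Huu = H[:n, :n], H[:n, n:], H[n:, n:]
    P = Hxx - Hxu @ np.linalg.pinv(Huu) @ Hxu.T        # value matrix of current H
    Phi = np.array([phi(np.concatenate([x0, u0])) for x0, u0 in zip(X, U)])
    y = np.array([x0 @ Qc @ x0 + u0 @ Rc @ u0 + x1 @ P @ x1
                  for x0, u0, x1 in zip(X, U, Xn)])    # VI Bellman targets
    theta = np.linalg.lstsq(Phi, y, rcond=None)[0]     # fit next Q-function
    Hn = np.zeros((p, p)); Hn[iu] = theta
    H = Hn + Hn.T - np.diag(np.diag(Hn))               # symmetrize

K = np.linalg.solve(H[n:, n:], H[:n, n:].T)            # learned gain: u = -K x
```

Because each VI step needs only a least-squares fit of measured transitions, no stabilizing initial gain is required, which is the property the PI-based alternative lacks.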

Highlights

  • For the controller design problem, optimizing performance costs is an important concern, since it can reduce energy effort and thereby benefit the environment

  • A simulation example verifies the effectiveness of the developed output feedback (OPFB) Q-learning algorithm based on the value iteration (VI) scheme

  • Compared with the policy iteration (PI)-based Algorithm 1, the VI-based Algorithm 2 is verified to remove the requirement of an initial stabilizing control policy



Introduction

Optimization of performance costs has been an important concern, since it may reduce energy effort, which in turn has positive consequences for the environment. The solution of the Riccati equation can be obtained efficiently by iterative computational algorithms [2], [3], which are, however, applicable only when complete knowledge of the system dynamics is available. It is often desirable in control engineering to design online learning controllers without resorting to the system dynamics [4]–[8]. Notice that a data-based method has been proposed in [9] to analyze
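The iterative computation referred to above can be sketched as value iteration on the discrete-time algebraic Riccati equation (DARE), starting from P = 0 so that no stabilizing initial gain is needed. The system matrices below are illustrative placeholders, not taken from the paper; note this model-based recursion is exactly what requires the complete system knowledge that the data-based methods aim to avoid.

```python
import numpy as np

# Value iteration for the discrete-time algebraic Riccati equation (DARE).
# A, B, Q, R are illustrative placeholders, not taken from the paper.
A = np.array([[0.8, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)          # state weighting
R = np.eye(1)          # input weighting

P = np.zeros((2, 2))   # VI may start from P = 0: no stabilizing guess required
for _ in range(500):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # greedy gain at step k
    P = Q + A.T @ P @ (A - B @ K)                      # Riccati recursion
# P approximates the DARE fixed point; u = -K x is the optimal LQR policy
```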

