Abstract

This article addresses online reinforcement $Q$-learning algorithms for designing an $H_{\infty}$ tracking controller for unknown discrete-time linear systems. An augmented system composed of the original system and the command generator is constructed, and a discounted performance function is introduced to establish a discounted game algebraic Riccati equation (GARE). Conditions for the existence of a solution to the GARE are proposed, and a lower bound on the discount factor is found that guarantees the stability of the $H_{\infty}$ tracking control solution. The $Q$-function Bellman equation is then derived, based on which a reinforcement $Q$-learning algorithm is developed to learn the solution to the $H_{\infty}$ tracking control problem without knowledge of the system dynamics. Both state-data-driven and output-data-driven reinforcement $Q$-learning algorithms for finding the control policies are proposed. Unlike the value function approximation (VFA)-based approach, the $Q$-learning scheme is proved to introduce no bias into the solution of the $Q$-function Bellman equation when the probing noise satisfies the persistent excitation (PE) condition, and it therefore converges to the nominal discounted GARE solution. Moreover, the proposed output-data-driven method is more practical than the state-data-driven method, since full state measurements may not be available in real applications. A simulation example with a single-phase voltage-source UPS inverter verifies the effectiveness of the proposed $Q$-learning algorithms.
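
The abstract outlines a $Q$-function policy-iteration scheme: evaluate a quadratic $Q$-function from data collected under probing noise, then improve the control and worst-case disturbance policies from its kernel. The sketch below illustrates that general structure only; it is not the paper's algorithm, and the augmented system matrices, weights, attenuation level, and discount factor are placeholder values chosen purely for illustration.

```python
import numpy as np

# --- Placeholder problem data (assumed for illustration; not from the paper) ---
np.random.seed(0)
n, m, q = 3, 1, 1                         # augmented state, control, disturbance dimensions
A = np.array([[0.9, 0.1, 0.0],
              [0.0, 0.8, 0.2],
              [0.0, 0.0, 1.0]])           # augmented dynamics (plant + command generator), assumed
B = np.array([[0.0], [0.5], [0.0]])       # control input matrix, assumed
E = np.array([[0.1], [0.0], [0.0]])       # disturbance input matrix, assumed
Qw = np.eye(n)                            # tracking-error weight on the augmented state, assumed
R = np.eye(m)                             # control weight, assumed
gamma2 = 5.0                              # squared attenuation level, assumed
gamma = 0.6                               # discount factor (must exceed the paper's lower bound)

def utility(x, u, w):
    """One-step cost: tracking penalty + control effort - attenuated disturbance."""
    return float(x @ Qw @ x + u @ R @ u - gamma2 * (w @ w))

def quad_basis(z):
    """Quadratic basis for z' H z with a symmetric kernel H (upper-triangular entries)."""
    i, j = np.triu_indices(len(z))
    scale = np.where(i == j, 1.0, 2.0)    # off-diagonal entries of H appear twice
    return scale * np.outer(z, z)[i, j]

K = np.zeros((m, n))                      # initial control policy u = K x
L = np.zeros((q, n))                      # initial disturbance policy w = L x
nz = n + m + q

for it in range(20):                      # policy iteration on the Q-function
    # Collect data with probing noise (persistent excitation).
    rows, targets = [], []
    x = np.random.randn(n)
    for k in range(200):
        u = K @ x + 0.5 * np.random.randn(m)
        w = L @ x + 0.5 * np.random.randn(q)
        x_next = A @ x + B @ u + E @ w
        z = np.concatenate([x, u, w])
        z_next = np.concatenate([x_next, K @ x_next, L @ x_next])
        # Q-function Bellman equation: z' H z = r(x,u,w) + gamma * z_next' H z_next
        rows.append(quad_basis(z) - gamma * quad_basis(z_next))
        targets.append(utility(x, u, w))
        x = x_next
    # Policy evaluation: least-squares estimate of the Q-function kernel H.
    h, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    H = np.zeros((nz, nz))
    H[np.triu_indices(nz)] = h
    H = H + H.T - np.diag(np.diag(H))
    # Policy improvement from the blocks of H (stationarity in u and w).
    Hux, Hwx = H[n:n + m, :n], H[n + m:, :n]
    Huu, Huw, Hww = H[n:n + m, n:n + m], H[n:n + m, n + m:], H[n + m:, n + m:]
    gains = -np.linalg.solve(np.block([[Huu, Huw], [Huw.T, Hww]]),
                             np.vstack([Hux, Hwx]))
    K, L = gains[:m, :], gains[m:, :]

print("learned control gain K:", K)
print("learned worst-case disturbance gain L:", L)
```

In this sketch the dynamics are used only to generate data; the least-squares step sees states, inputs, and one-step costs alone, which is the sense in which such schemes are model-free.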
