This paper considers the problem of linear quadratic tracking control (LQTC) with a discounted cost function for unknown systems. Existing design methods often require the discount factor to be small enough to guarantee closed-loop stability; however, solving the discounted algebraic Riccati equation (ARE) may suffer from numerical ill-conditioning when the discount factor is too small. Using singular perturbation theory, we decompose the full-order discounted ARE into a reduced-order ARE and a Sylvester equation, which facilitates the design of the feedback and feedforward control gains. The obtained controller is proved to be a stabilizing and near-optimal solution to the original LQTC problem. Within the reinforcement learning framework, both on-policy and off-policy two-phase learning algorithms are derived to obtain the near-optimal tracking control policy without requiring knowledge of the discount factor. The advantages of the developed results are illustrated through comparative simulations.
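To make the feedback/feedforward structure mentioned above concrete, the following is a minimal numerical sketch, assuming a continuous-time plant ẋ = Ax + Bu with output y = Cx, a reference generated by an exosystem ṙ = Sr, and an exponentially discounted quadratic tracking cost with rate α. All matrices and the discount rate are illustrative assumptions; this generic ARE-plus-Sylvester construction is not the paper's singular-perturbation decomposition or its learning algorithms.

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_sylvester

# Hypothetical plant x_dot = A x + B u with output y = C x; values are illustrative only.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[1.0, 0.0]])

# Exosystem r_dot = S r generating the reference (here a constant set-point).
S = np.zeros((1, 1))

# Tracking-error weight, control weight, and discount rate (exp(-alpha*t) weighting).
Qy = np.array([[10.0]])
R = np.array([[1.0]])
alpha = 0.1

# Feedback gain: the discounted ARE can be written as a standard ARE for the
# shifted matrix A - (alpha/2) I; then K = R^{-1} B^T P.
A_shift = A - 0.5 * alpha * np.eye(A.shape[0])
P = solve_continuous_are(A_shift, B, C.T @ Qy @ C, R)
K = np.linalg.solve(R, B.T @ P)

# Feedforward gain: solve the Sylvester equation
#   (A - B K - alpha I)^T Pi + Pi S = C^T Qy,   then K_ff = R^{-1} B^T Pi.
Acl = A - B @ K - alpha * np.eye(A.shape[0])
Pi = solve_sylvester(Acl.T, S, C.T @ Qy)
Kff = np.linalg.solve(R, B.T @ Pi)

# Tracking control law u = -K x - K_ff r, evaluated at a sample state and reference.
x = np.array([0.5, 0.0])
r = np.array([1.0])
u = -K @ x - Kff @ r
print("K =", K, "\nK_ff =", Kff, "\nu =", u)
```

The Sylvester equation has a unique solution here because A − BK − αI is Hurwitz while S has eigenvalues with nonnegative real parts, so the two spectra do not overlap; this mirrors why the feedforward term in LQTC can be computed separately once the feedback gain is fixed.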