Off‐policy integral reinforcement learning‐based optimal tracking control for a class of nonzero‐sum game systems with unknown dynamics

Jin‐Gang Zhao, Fang‐Fang Chen

doi:10.1002/oca.2916

Abstract

This article studies the optimal tracking control problem for a class of multi‐input nonlinear systems with unknown dynamics, based on reinforcement learning (RL) and nonzero‐sum game theory. First, an augmented system composed of the tracking error dynamics and the command generator dynamics is constructed. Then, a set of tracking coupled Hamilton–Jacobi (HJ) equations associated with a discounted cost function is derived, whose solution gives the Nash equilibrium. The existence of the Nash equilibrium is proved. To approximate the Nash equilibrium solution of the tracking coupled HJ equations, two model‐based policy iteration (PI) algorithms are given, and their equivalence and convergence are analyzed. Further, to remove the need for prior knowledge of the system dynamics, an off‐policy integral reinforcement learning (OP‐IRL) algorithm implemented by neural networks (NNs) is proposed. The weights of the critic NNs and actor NNs are updated simultaneously by the gradient descent method. The convergence of the NN weights and the stability of the closed‐loop error systems are proved. Finally, numerical simulation results are provided to demonstrate the effectiveness of the proposed OP‐IRL method.
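
For orientation, the setup summarized above can be sketched in the form commonly used for discounted nonzero‐sum tracking games. The abstract does not state these equations, so everything below is an assumption for illustration, not the authors' exact formulation: the symbols f, g_j, h, Q_i, R_ij, the discount factor γ, and the input‐affine structure are all hypothetical choices.

```latex
% Illustrative sketch only: all symbols are assumptions, not taken from the article.
% N-player input-affine system and command generator (assumed forms):
\begin{align}
  \dot{x} &= f(x) + \sum_{j=1}^{N} g_j(x)\,u_j, &
  \dot{r} &= h(r), &
  e &= x - r .
\end{align}
% Augmented state X = [e^T, r^T]^T, combining tracking error and reference:
\begin{align}
  \dot{X} &= F(X) + \sum_{j=1}^{N} G_j(X)\,u_j, &
  F(X) &= \begin{bmatrix} f(e+r) - h(r) \\ h(r) \end{bmatrix}, &
  G_j(X) &= \begin{bmatrix} g_j(e+r) \\ 0 \end{bmatrix}.
\end{align}
% Discounted cost of player i, with discount factor \gamma > 0:
\begin{equation}
  V_i\bigl(X(t)\bigr) = \int_t^{\infty} e^{-\gamma(\tau - t)}
  \Bigl( X^{\top} Q_i X + \sum_{j=1}^{N} u_j^{\top} R_{ij}\, u_j \Bigr)\, d\tau .
\end{equation}
% Nash equilibrium: no player can lower its own cost by deviating unilaterally.
\begin{equation}
  V_i\bigl(u_i^{*}, u_{-i}^{*}\bigr) \le V_i\bigl(u_i, u_{-i}^{*}\bigr)
  \quad \text{for all admissible } u_i,\; i = 1, \dots, N .
\end{equation}
```

Under these assumptions, the discount factor keeps each cost finite even when the reference trajectory does not decay to zero, which is a usual motivation for discounting in tracking problems.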
