Single‐network ADP for near optimal control of continuous‐time zero‐sum games without using initial stabilising control laws

Chaoxu Mu, Ke Wang

doi:10.1049/iet-cta.2018.5464

Abstract

This study develops an approximately optimal critic learning algorithm, based on single-network adaptive dynamic programming (ADP), for solving continuous-time two-player zero-sum games in the absence of an initial stabilising control policy. Here, single-network refers to a single critic neural network, which is used to derive the saddle-point equilibrium of a zero-sum differential game by approximately learning the value function. First, the authors formulate the two-player zero-sum game problem mathematically and analyse the similarities between the zero-sum game problems for linear and non-linear systems. Then, the adaptive learning scheme is implemented as a critic structure that derives the control and disturbance policies by learning the optimal value function, and a novel weight tuning law involving a stable operator is proposed to ensure convergence and stability. Moreover, the uniformly ultimately bounded (UUB) stability of the whole system is rigorously proved using Lyapunov theory. Finally, simulation results are provided to confirm the effectiveness of the improved approximate optimal control technique in solving equations for a complex linear system and a non-linear system.
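
For orientation, the quantities named in the abstract can be written in the form commonly used for continuous-time two-player zero-sum games. The block below is a sketch under assumed notation, not a reproduction of the paper's own equations: the affine dynamics $f$, $g$, $k$, the weights $Q$ and $R$, the attenuation level $\gamma$, the critic basis $\sigma$ with weight estimate $\hat{W}$, and the linear matrices $A$, $B$, $D$, $P$ are all symbols introduced here for illustration.

```latex
% Assumed setting: input-affine dynamics with control u and disturbance w.
\begin{align}
  \dot{x} &= f(x) + g(x)\,u + k(x)\,w, \\
  V(x_0)  &= \int_0^{\infty} \left( x^{\top} Q x + u^{\top} R u
             - \gamma^{2} w^{\top} w \right) \mathrm{d}t, \\
  V^{*}(x_0) &= \min_{u} \max_{w} V(x_0).
\end{align}

% Stationarity of the Hamiltonian in u and w gives the saddle-point policies,
% and substituting them back gives the Hamilton-Jacobi-Isaacs (HJI) equation.
\begin{align}
  u^{*} &= -\tfrac{1}{2} R^{-1} g^{\top}(x)\, \nabla V^{*}(x), \qquad
  w^{*}  = \tfrac{1}{2\gamma^{2}} k^{\top}(x)\, \nabla V^{*}(x), \\
  0 &= x^{\top} Q x + \nabla V^{*\top} \left( f + g u^{*} + k w^{*} \right)
       + u^{*\top} R u^{*} - \gamma^{2} w^{*\top} w^{*}.
\end{align}

% Single critic network (assumed linear-in-weights form): both policies are
% read off from the one value-function estimate.
\begin{align}
  \hat{V}(x) &= \hat{W}^{\top} \sigma(x), \qquad
  \nabla \hat{V}(x) = \nabla \sigma^{\top}(x)\, \hat{W}, \\
  \hat{u} &= -\tfrac{1}{2} R^{-1} g^{\top}(x)\, \nabla \sigma^{\top}(x)\, \hat{W}, \qquad
  \hat{w}  = \tfrac{1}{2\gamma^{2}} k^{\top}(x)\, \nabla \sigma^{\top}(x)\, \hat{W}.
\end{align}

% Linear special case: f = Ax, g = B, k = D, V^*(x) = x^T P x.
% The HJI equation then reduces to a game algebraic Riccati equation.
\begin{align}
  u^{*} &= -R^{-1} B^{\top} P x, \qquad
  w^{*}  = \tfrac{1}{\gamma^{2}} D^{\top} P x, \\
  0 &= A^{\top} P + P A + Q - P B R^{-1} B^{\top} P
       + \tfrac{1}{\gamma^{2}} P D D^{\top} P.
\end{align}
```

In this assumed form, the linear and non-linear problems share the same structure: the quadratic value $x^{\top} P x$ is a special case of the critic approximation, and the Riccati equation is the HJI equation restricted to linear dynamics.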

Full Text