Online concurrent reinforcement learning algorithm to solve two‐player zero‐sum games for partially unknown nonlinear continuous‐time systems

Sholeh Yasini, Ali Karimpour, Hamidreza Modares, Mohammad‐Bagher Naghibi Sistani

doi:10.1002/acs.2485

Abstract

Online adaptive optimal control methods based on reinforcement learning algorithms typically need to check the persistence of excitation condition, which must be known a priori to guarantee convergence of the algorithm. However, this condition is often infeasible to implement or monitor online. This paper proposes an online concurrent reinforcement learning algorithm (CRLA) based on neural networks (NNs) to solve the H∞ control problem of partially unknown continuous‐time systems, in which the need for the persistence of excitation condition is relaxed by using the idea of concurrent learning. First, the H∞ control problem is formulated as a two‐player zero‐sum game, and then the online CRLA is employed to approximate the optimal value and the Nash equilibrium of the game. The proposed algorithm is implemented on an actor–critic–disturbance NN approximator structure to obtain the solution of the Hamilton–Jacobi–Isaacs equation online and forward in time. During the implementation of the algorithm, the control input, which acts as one player, attempts to apply the optimal control, while the other player, the disturbance, tries to apply the worst‐case disturbance. Novel update laws are derived for adaptation of the critic and actor NN weights. The stability of the closed‐loop system is guaranteed using a Lyapunov technique, and convergence to the Nash solution of the game is obtained. Simulation results show the effectiveness of the proposed method. Copyright © 2014 John Wiley & Sons, Ltd.
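
The idea of concurrent learning mentioned in the abstract can be illustrated with a toy example. The sketch below is not the paper's CRLA or its NN update laws; it is a minimal linear-in-parameters regression, under the assumption that the live regressor never excites one of the weights. All names (measure, train, TRUE_W, LIVE_PHI, MEMORY) are hypothetical and chosen only for this illustration. Plain gradient descent on the live sample leaves the unexcited weight untouched, while adding the errors on a small stack of recorded samples that spans the parameter space lets both weights converge.

```python
# Toy illustration of concurrent learning (assumed setup, not the paper's algorithm).
# Model: y = w . phi, with unknown weights w estimated by gradient descent.

TRUE_W = [2.0, -1.0]          # hypothetical "unknown" weights, used only to generate data


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def measure(phi):
    """Return the (noise-free) measurement for regressor phi."""
    return dot(TRUE_W, phi)


def train(live_sample, memory, steps=2000, lr=0.05):
    """Gradient descent on the live error plus the errors on recorded samples.

    live_sample: (phi, y) pair observed at every step.
    memory: list of recorded (phi, y) pairs; an empty list gives plain gradient descent.
    """
    w = [0.0, 0.0]
    for _ in range(steps):
        grad = [0.0, 0.0]
        for phi, y in [live_sample] + memory:
            err = dot(w, phi) - y
            for i, p in enumerate(phi):
                grad[i] += err * p
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


# The live regressor only ever excites the first weight (no persistence of excitation).
LIVE_PHI = [1.0, 0.0]
live = (LIVE_PHI, measure(LIVE_PHI))

# Recorded samples whose regressors span the whole parameter space.
MEMORY = [(phi, measure(phi)) for phi in ([1.0, 1.0], [1.0, -1.0])]

w_plain = train(live, memory=[])
w_concurrent = train(live, memory=MEMORY)

print("plain gradient:     ", w_plain)
print("concurrent learning:", w_concurrent)
```

In this sketch the rank condition on the recorded regressors plays the role that persistence of excitation would otherwise play, and unlike persistence of excitation it can be checked directly on the stored data.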

Journal: International Journal of Adaptive Control and Signal Processing
Publication Date: Apr 7, 2014
Citations: 20

Similar Papers

Optimal control of affine nonlinear discrete-time systems
Travis Dierks ... S Jagannathan
01 Jun 2009

Sliding-mode surface-based adaptive actor-critic optimal control for switched nonlinear systems with average dwell time
Haoyan Zhang ... Adil M Ahmad
Information Sciences | VOL. 580
20 Aug 2021

Online optimal control of affine nonlinear discrete-time systems with unknown internal dynamics by using time-based policy update
T Dierks ... S Jagannathan
IEEE Transactions on Neural Networks and Learning Systems | VOL. 23
01 Jul 2012

Optimal Adaptive Control and Differential Games by Reinforcement Learning Principles [Book review]
IEEE Control Systems | VOL. 34
01 Jun 2014
