A Parallel Framework of Adaptive Dynamic Programming Algorithm With Off-Policy Learning

Changyin Sun, Xiaofeng Li, Yuewen Sun

doi:10.1109/tnnls.2020.3015767

Abstract

In this article, a model-free online adaptive dynamic programming (ADP) approach is developed for solving the optimal control problem of nonaffine nonlinear systems. By combining the off-policy learning mechanism with the parallel paradigm, multithread agents are employed to collect transitions by interacting with the environment, which significantly increases the number of sampled data. In addition, each thread agent explores the environment from a different initial state under its own behavior policy, which enhances the exploration capability and alleviates the correlation between the sampled data. After the policy evaluation process, only a one-step update is required for policy improvement based on the policy gradient method. The stability of the system under the iterative control laws is guaranteed. Moreover, a convergence analysis is given to prove that the iterative Q-function is monotonically nonincreasing and finally converges to the solution of the Hamilton-Jacobi-Bellman (HJB) equation. To implement the algorithm, the actor-critic (AC) structure is utilized with two neural networks (NNs) that approximate the Q-function and the control policy. Finally, the effectiveness of the proposed algorithm is verified by two numerical examples.
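
The parallel data-collection idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the scalar plant `step`, the quadratic `utility`, the base control law `-0.5 * x`, the Gaussian exploration noise, and all function names are hypothetical. The sketch shows only how several thread agents, each with its own initial state and behavior policy, can fill one shared set of transitions for later off-policy evaluation.

```python
import math
import random
import threading


def step(x, u):
    """Hypothetical nonaffine scalar plant: the control enters through tanh."""
    return 0.9 * math.sin(x) + math.tanh(u)


def utility(x, u):
    """Quadratic stage cost (an assumed choice, not the paper's)."""
    return x * x + u * u


def collect(agent_id, x0, noise_std, n_steps, buffer, lock):
    """One thread agent: roll out its own behavior policy from its own initial state."""
    rng = random.Random(agent_id)  # per-agent seed, so agents explore differently
    x = x0
    for _ in range(n_steps):
        # Behavior policy = assumed base law plus agent-specific exploration noise.
        u = -0.5 * x + rng.gauss(0.0, noise_std)
        x_next = step(x, u)
        with lock:  # the buffer is shared by all agents
            buffer.append((x, u, utility(x, u), x_next))
        x = x_next


def collect_parallel(initial_states, noise_stds, n_steps):
    """Run one agent per (initial state, noise level) pair and return all transitions."""
    buffer, lock = [], threading.Lock()
    threads = [
        threading.Thread(target=collect, args=(i, x0, std, n_steps, buffer, lock))
        for i, (x0, std) in enumerate(zip(initial_states, noise_stds))
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return buffer


# Four agents, 50 steps each: tuples of (state, control, cost, next state).
transitions = collect_parallel([-1.0, 0.5, 2.0, -2.5], [0.1, 0.2, 0.3, 0.4], n_steps=50)
```

Because the transitions come from behavior policies rather than from the policy being evaluated, a critic would have to be trained on them off-policy. Note also that Python threads illustrate the structure only; for CPU-bound simulation they do not run in true parallel.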

Journal: IEEE Transactions on Neural Networks and Learning Systems
Publication Date: Aug 24, 2020
Citations: 19

Similar Papers

Discrete-Time Local Value Iteration Adaptive Dynamic Programming: Admissibility and Termination Analysis.
Qinglai Wei ... Derong Liu
IEEE Transactions on Neural Networks and Learning Systems | VOL. 28
01 Nov 2017

Event-triggered optimal control for nonlinear stochastic systems via adaptive dynamic programming
Guoping Zhang ... Quanxin Zhu
Nonlinear Dynamics | VOL. 105
25 Jun 2021

Online Optimal Control of Continuous-Time Affine Nonlinear Systems
Derong Liu ... Xiong Yang
01 Jan 2017

Value Iteration Adaptive Dynamic Programming for Optimal Control of Discrete-Time Nonlinear Systems
Qinglai Wei ... Derong Liu
IEEE Transactions on Cybernetics | VOL. 46
02 Nov 2015
