Discrete-Time Nonlinear HJB Solution Using Approximate Dynamic Programming: Convergence Proof

A. Al-Tamimi, F. L. Lewis, M. Abu-Khalaf

doi:10.1109/tsmcb.2008.926614

Abstract

Convergence of the value-iteration-based heuristic dynamic programming (HDP) algorithm is proven for general nonlinear systems. That is, it is shown that HDP converges to the optimal control and to the optimal value function that solves the Hamilton-Jacobi-Bellman equation appearing in infinite-horizon discrete-time (DT) nonlinear optimal control. It is assumed that, at each iteration, the value and action update equations can be solved exactly. Two standard neural networks (NNs) are used: a critic NN approximates the value function, and an action NN approximates the optimal control policy. It is stressed that this approach allows HDP to be implemented without knowing the internal dynamics of the system. The exact-solution assumption holds for some classes of nonlinear systems and, in particular, for the DT linear quadratic regulator (LQR), where the action is linear in the states, the value is quadratic in the states, and the NNs have zero approximation error. It is stressed that, for the LQR, HDP may be implemented without knowing the system A matrix by using two NNs. This fact is not generally appreciated in the folklore of HDP for the DT LQR, where only one critic NN is generally used.
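To make the LQR special case concrete, the following is a minimal sketch of value iteration with exact action and value updates, written in Python with NumPy. It is an illustration under stated assumptions, not the authors' implementation: it uses A and B explicitly (so it is model-based, unlike the two-NN scheme the abstract describes), it assumes the iteration starts from a zero value function, and the function name, the example system (a discretized double integrator), and the weights Q and R are all hypothetical choices made for this example.

```python
import numpy as np


def hdp_value_iteration_lqr(A, B, Q, R, max_iters=2000, tol=1e-10):
    """Value iteration for the DT LQR with exact updates (illustrative sketch).

    The value function is V_i(x) = x' P_i x and the action is u = -K_i x.
    Assumption: the iteration starts from V_0 = 0, i.e. P_0 = 0.
    """
    n = A.shape[0]
    P = np.zeros((n, n))
    for _ in range(max_iters):
        # Action update: minimize x'Qx + u'Ru + V_i(Ax + Bu) over u.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Value update: evaluate the one-step cost under that action.
        A_cl = A - B @ K
        P_next = Q + K.T @ R @ K + A_cl.T @ P @ A_cl
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    # Gain that is greedy with respect to the final value function.
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K


# Hypothetical example system: double integrator sampled at 0.1 s.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

P, K = hdp_value_iteration_lqr(A, B, Q, R)
print("P =\n", P)
print("K =", K)
```

In this sketch the critic and the action are represented exactly by the matrices P and K, which mirrors the abstract's remark that the value is quadratic and the action linear in the LQR case, so no approximation error arises. The fixed point of the loop satisfies the discrete-time algebraic Riccati equation.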

Journal: IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)
Publication Date: Aug 1, 2008
Citations: 971

Similar Papers

Heuristic Dynamic Programming Nonlinear Optimal Controller
01 Jan 2009

Discrete-time nonlinear HJB solution using Approximate dynamic programming: Convergence Proof
Asma Al-Tamimi ... Frank Lewis
01 Apr 2007

Heuristic dynamic programming for neural network vector control of a grid-connected converter
Xingang Fu ... Shuihui Li
01 Jul 2014

Novel iterative neural dynamic programming for data-based approximate optimal control design
Chaoxu Mu ... Haibo He
Automatica, Vol. 81
20 Apr 2017
