Convergence analysis of the deep neural networks based globalized dual heuristic programming

Jong Woo Kim, Tae Hoon Oh, Sang Hwan Son, Dong Hwi Jeong, Jong Min Lee

doi:10.1016/j.automatica.2020.109222

Abstract

Globalized dual heuristic programming (GDHP) is a special form of approximate dynamic programming (ADP) that solves the Hamilton–Jacobi–Bellman (HJB) equation when the system takes a control-affine form and the cost function is quadratic. This study incorporates deep neural networks (DNNs) as the function approximator to exploit their ability to represent high-dimensional function spaces. An elementwise error bound of the costate function sequence is newly derived, and its convergence property is presented. In the approximated function space, a uniformly ultimate boundedness (UUB) condition for the weights of general multi-layer NNs is obtained. It is also proved that, under the gradient descent method for solving the moving-target regression problem, the ultimate bound gradually converges to a value that contains only the approximation reconstruction error. The proposed method is demonstrated on continuous reactor control, with the aim of obtaining a control policy for multiple initial states, which justifies the need for a DNN structure in such cases.
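
For readers unfamiliar with the setting, the problem class named in the abstract can be sketched as follows. This is a standard textbook formulation, assuming a continuous-time, infinite-horizon problem; the symbols $f$, $g$, $Q$, $R$, $V$, and $\lambda$ are illustrative choices and are not taken from the abstract, so the paper's own notation and time discretization may differ.

```latex
% Control-affine dynamics (assumed notation)
\dot{x} = f(x) + g(x)\,u

% Quadratic cost with Q \succeq 0 and R \succ 0
J(x_0) = \int_0^{\infty} \left( x^\top Q x + u^\top R u \right) dt

% HJB equation for the optimal value function V
0 = \min_{u} \left[ x^\top Q x + u^\top R u
      + \nabla V(x)^\top \bigl( f(x) + g(x)\,u \bigr) \right]

% Costate and the resulting minimizing control
\lambda(x) = \nabla V(x), \qquad
u^*(x) = -\tfrac{1}{2}\, R^{-1} g(x)^\top \lambda(x)
```

Under these assumptions the optimal control depends on the value function only through its gradient, which is why GDHP-type methods approximate the costate $\lambda$ alongside $V$.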


Journal: Automatica
Publication Date: Aug 26, 2020
Citations: 9

Similar Papers

Approximately Optimal Control of Discrete-Time Nonlinear Switched Systems Using Globalized Dual Heuristic Programming
Chaoxu Mu ... Kaiju Liao
Neural Processing Letters | VOL. 52
30 Jul 2020

Approximate Dynamic Programming with (min; +) linear function approximation for Markov decision processes
L Chandrashekar ... Shalabh Bhatnagar
01 Dec 2014

Adaptive Identifier-Critic-Based Optimal Tracking Control for Nonlinear Systems With Experimental Validation
Jing Na ... Yongfeng Lv
IEEE Transactions on Systems, Man, and Cybernetics: Systems | VOL. 52
02 Jul 2020

Approximate Dynamic Programming for Commodity and Energy Merchant Operations
01 Apr 2014
