Abstract

In this paper, a novel policy iteration adaptive dynamic programming (ADP) algorithm, called the "local policy iteration ADP algorithm," is presented to obtain the optimal control for discrete stochastic processes. In the proposed algorithm, the iterative decision rules are updated only on a local subset of the whole state space, which significantly reduces the computational burden in comparison with the conventional policy iteration algorithm. A convergence analysis shows that the iterative value functions are monotonically nonincreasing and converge to the optimum within a local policy space, which is characterized in detail for the first time. Under a few mild constraints, it is further shown that the iterative value functions converge to the optimal performance index function of the global policy space. Finally, a simulation example is presented to validate the effectiveness of the developed method.
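
To make the local-update idea concrete, the sketch below applies it to a finite-state, finite-action Markov decision process: the policy is evaluated over the whole state space, but the decision rule is improved only on one local subset of states per iteration. This is a minimal illustration under assumed tabular dynamics, not the authors' implementation (which targets general discrete stochastic processes); the function name local_policy_iteration, the transition tensor P, the stage cost g, and the rotating local_sets schedule are all hypothetical.

```python
import numpy as np

def local_policy_iteration(P, g, gamma, local_sets, n_iters=50, eval_iters=200):
    """Sketch of local policy iteration for a finite MDP (cost minimization).

    P          -- transition tensor of shape (A, S, S): P[a, s, t] = Pr(t | s, a)
    g          -- stage-cost matrix of shape (S, A)
    gamma      -- discount factor in (0, 1)
    local_sets -- sequence of state-index arrays; at iteration i only the
                  states in local_sets[i % len(local_sets)] have their
                  decision rule improved
    """
    A, S, _ = P.shape
    policy = np.zeros(S, dtype=int)            # arbitrary initial decision rule
    for i in range(n_iters):
        # Policy evaluation over the WHOLE state space
        # (approximate fixed-point iteration on V = g_pi + gamma * P_pi V).
        P_pi = P[policy, np.arange(S), :]      # (S, S) transitions under policy
        g_pi = g[np.arange(S), policy]         # (S,) stage costs under policy
        V = np.zeros(S)
        for _ in range(eval_iters):
            V = g_pi + gamma * P_pi @ V
        # Policy improvement restricted to the current LOCAL state subset;
        # states outside the subset keep their previous decision rule.
        Q = g + gamma * np.einsum('ast,t->sa', P, V)   # Q[s, a]
        local = local_sets[i % len(local_sets)]
        policy[local] = np.argmin(Q[local], axis=1)
    return policy, V

# Hypothetical usage on a random MDP, cycling the local subset over
# blocks of states that together cover the whole state space.
rng = np.random.default_rng(0)
S, A = 20, 4
P = rng.random((A, S, S))
P /= P.sum(axis=2, keepdims=True)              # make transitions row-stochastic
g = rng.random((S, A))
blocks = [np.arange(k, min(k + 5, S)) for k in range(0, S, 5)]
policy, V = local_policy_iteration(P, g, 0.9, blocks)
```

Cycling the local subset through blocks that cover the state space is one way such a scheme can still approach the global optimum, consistent with the convergence claims hedged in the abstract; each iteration only pays the improvement cost for the current block rather than for every state.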
