Abstract
This paper proposes an adaptive dynamic programming (ADP) algorithm based on value iteration (VI) to solve the infinite-horizon stochastic linear quadratic (SLQ) optimal control problem for linear discrete-time systems with completely unknown dynamics. First, the SLQ control problem is converted into a deterministic problem through a system transformation, and an iterative ADP algorithm is introduced to solve the resulting optimal control problem, together with a convergence analysis. Second, to implement the iterative algorithm, one neural network (NN) identifies the unknown system, and two further NNs approximate the cost function and the control gain matrix. Finally, two simulation examples illustrate the effectiveness of the iterative ADP approach.
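To make the value-iteration idea concrete, the following is a minimal sketch under stated assumptions, not the paper's implementation. It assumes the commonly used SLQ model with multiplicative noise, x(k+1) = A x(k) + B u(k) + (C x(k) + D u(k)) w(k), with w(k) zero-mean, unit-variance white noise and stage cost x'Qx + u'Ru. It also assumes the system matrices are known, whereas the abstract addresses unknown dynamics using NNs. The function name `slq_value_iteration` and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def slq_value_iteration(A, B, C, D, Q, R, iters=500, tol=1e-10):
    """Value iteration on the generalized Riccati recursion (illustrative).

    Assumed model: x+ = A x + B u + (C x + D u) w, with E[w] = 0, E[w^2] = 1.
    Starts from P = 0 and returns (P, K) for the feedback law u = -K x.
    """
    n = A.shape[0]
    P = np.zeros((n, n))  # zero initial value function
    for _ in range(iters):
        G = R + B.T @ P @ B + D.T @ P @ D      # quadratic term in u
        H = B.T @ P @ A + D.T @ P @ C          # cross term between u and x
        P_next = Q + A.T @ P @ A + C.T @ P @ C - H.T @ np.linalg.solve(G, H)
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    # Gain associated with the final value-function matrix
    G = R + B.T @ P @ B + D.T @ P @ D
    H = B.T @ P @ A + D.T @ P @ C
    K = np.linalg.solve(G, H)
    return P, K

# Hypothetical scalar example (values are assumptions, not from the paper)
A = np.array([[0.9]])
B = np.array([[1.0]])
C = np.array([[0.1]])
D = np.array([[0.0]])
Q = np.array([[1.0]])
R = np.array([[1.0]])

P, K = slq_value_iteration(A, B, C, D, Q, R)
print("P =", P)
print("K =", K)
```

Starting from P = 0 mirrors the usual VI initialization, which does not require an initial stabilizing controller. In a model-free setting, the matrix products involving A, B, C, and D would have to be replaced by quantities estimated from data, for example through an identified model.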