Abstract

In this paper, a critic learning structure based on a novel utility function is developed to solve the discounted optimal tracking control problem for affine nonlinear systems. The utility function is defined as a quadratic form of the tracking error at the next time step, which both avoids solving for the stable control input and effectively eliminates the tracking error. The theoretical derivation of the method under value iteration is given in detail, together with convergence and stability analysis. The dual heuristic dynamic programming (DHP) algorithm, implemented with a single neural network, is then introduced to reduce the amount of computation. In the DHP implementation, a polynomial is used to approximate the costate function, and the weighted residual method is used to update the weight matrix. In simulation, the convergence speed of the proposed strategy is compared with that of the heuristic dynamic programming (HDP) algorithm, and the results show that the proposed method converges faster. The proposed method is also compared with the traditional tracking control approach to verify its tracking performance. The results show that the proposed method avoids solving for the stable control input and that its tracking error is closer to zero than that of the traditional strategy.
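
To make the idea concrete, the setup described above can be sketched as follows. This is an illustrative formulation only, not the paper's own equations: the symbols f, g, r_k, e_k, Q, \gamma, W and \phi are assumed here, and the paper's notation and exact definitions may differ.

\[
x_{k+1} = f(x_k) + g(x_k)\,u_k, \qquad e_{k+1} = x_{k+1} - r_{k+1},
\]
\[
U(e_{k+1}) = e_{k+1}^{\top} Q\, e_{k+1}, \qquad
J(e_k) = \sum_{j=k}^{\infty} \gamma^{\,j-k}\, U(e_{j+1}), \qquad 0 < \gamma \le 1 .
\]

Under these assumptions, a value iteration step would take the form

\[
V_{i+1}(e_k) = \min_{u_k} \bigl\{ U(e_{k+1}) + \gamma\, V_i(e_{k+1}) \bigr\},
\]

and a DHP-style critic would approximate the costate \lambda(e) = \partial V(e) / \partial e by a polynomial basis, for example \hat{\lambda}(e) = W^{\top} \phi(e). Because the utility penalizes only the next-step error and contains no term of the form (u_k - u_{d,k}), no reference (stable) control input u_{d,k} needs to be computed, which is consistent with the advantage claimed in the abstract.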
