Abstract

This paper presents a finite-horizon optimal control design for affine nonlinear discrete-time systems with completely unknown system dynamics. First, a novel neural network (NN)-based identifier is used to learn the control coefficient matrix. This identifier is then combined with an action-critic-based scheme to learn the time-varying solution of the Hamilton-Jacobi-Bellman (HJB) equation, referred to as the value function, online and forward in time. To handle the time-varying nature of the value function, NNs with constant weights and time-varying activation functions are used. To satisfy the terminal constraint, an additional term is added to the novel update law. Uniform ultimate boundedness of the closed-loop system is demonstrated using standard Lyapunov theory. Simulation results verify the effectiveness of the proposed method.
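
The idea of a value-function approximator with constant weights, time-varying activation functions, and an extra terminal-constraint term can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not give the basis functions, the terminal cost, or the update law, so `basis`, `terminal_update`, the scalar state, and the normalized gradient step are hypothetical stand-ins rather than the paper's method.

```python
import math


def basis(x, time_to_go):
    """Hypothetical time-varying activation vector for a scalar state x.

    The weights multiplying these functions stay constant; only the
    activations depend on the time-to-go (horizon N minus step k).
    """
    t = time_to_go
    return [x * x, x * x * math.exp(-t), x * x * t / (1.0 + t)]


def value_estimate(weights, x, time_to_go):
    """Approximate value function: weights^T * basis(x, time_to_go)."""
    return sum(w * s for w, s in zip(weights, basis(x, time_to_go)))


def terminal_error(weights, x_final, terminal_cost):
    """Mismatch between the terminal cost and the estimate at time-to-go 0."""
    return terminal_cost(x_final) - value_estimate(weights, x_final, 0)


def terminal_update(weights, x_final, terminal_cost, rate=0.5):
    """One normalized gradient step on the terminal-constraint error.

    This is an assumed, generic update, not the law from the paper.
    For 0 < rate <= 1 it shrinks the error magnitude whenever the
    basis vector at the terminal time is nonzero.
    """
    s = basis(x_final, 0)
    e = terminal_error(weights, x_final, terminal_cost)
    norm = 1.0 + sum(v * v for v in s)
    return [w + rate * e * v / norm for w, v in zip(weights, s)]


if __name__ == "__main__":
    psi = lambda x: 2.0 * x * x      # assumed quadratic terminal cost
    w = [0.0, 0.0, 0.0]              # constant NN weights, initialized at zero
    for _ in range(5):
        w = terminal_update(w, 1.0, psi)
        print(abs(terminal_error(w, 1.0, psi)))
```

In a complete scheme this terminal term would be added to a Bellman-error term in the critic update; the sketch isolates the terminal part only.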

