Abstract

In this brief article, a novel adaptive integral reinforcement learning (AIRL) scheme is proposed for the continuous-time (CT) system. Moreover, it is used to learn the optimal controls of the partially unknown multi-input nonlinear system. Firstly, the Nash equilibrium of multi-input is defined. Two neural networks (NN) are used to approximate the cost functions with the integral reinforcement signal, which can avoid directly solving the Hamilton-Jacobi-Bellman (HJB) equation such that dynamic information and derivatives of NN activations are not needed. Then, a novel learning algorithm is used to update the unknown NN weights. The studied weights are used to obtain the optimum multi-policies. The learned weight convergence is proved. Finally, two examples are presented to verify the system performance with the proposed AIRL scheme.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call