Abstract
This brief proposes a novel adaptive integral reinforcement learning (AIRL) scheme for continuous-time (CT) systems and uses it to learn the optimal controls of a partially unknown multi-input nonlinear system. First, the Nash equilibrium of the multi-input system is defined. Two neural networks (NNs) then approximate the cost functions using the integral reinforcement signal, which avoids solving the Hamilton-Jacobi-Bellman (HJB) equation directly, so that the dynamic information and the derivatives of the NN activations are not needed. A novel learning algorithm updates the unknown NN weights, and the learned weights are used to obtain the optimal control policies. Convergence of the learned weights is proved. Finally, two examples are presented to verify the performance of the system under the proposed AIRL scheme.