Finite-Horizon Optimal Consensus Control for Unknown Multiagent State-Delay Systems.

Huaipin Zhang, Ju H. Park, Xiangpeng Xie, Dong Yue

doi:10.1109/tcyb.2018.2856510

Abstract

This paper investigates the finite-horizon optimal consensus control problem for unknown multiagent systems with state delays. It is well known that the optimal consensus control is given by the solutions of the coupled Hamilton-Jacobi-Bellman (HJB) equations. An off-policy reinforcement learning (RL) algorithm is developed to learn the two-stage optimal consensus solutions of the coupled time-varying HJB equations using measurable state data instead of knowledge of the state-delayed system dynamics. Subsequently, for each agent, a single critic neural network (NN) is used to approximate the time-varying cost function and to help calculate the optimal consensus control policy. Based on the method of weighted residuals, adaptive weight update laws for the critic NNs are proposed. Finally, simulation results are provided to illustrate the effectiveness of the proposed off-policy RL method.
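
To make the idea of fitting a time-varying critic by the method of weighted residuals more concrete, the following is a minimal sketch on a toy problem. It is not the paper's algorithm: it assumes a single agent, no state delay, known scalar dynamics, and a fixed policy, and it uses the least-squares variant of weighted residuals at collocation points. All names (`critic_basis`, `residual_row`, `fit_critic_weights`) and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

# Toy setup (an assumption, not the paper's system): scalar dynamics
# dx/dt = a*x under the fixed policy u = 0, running cost q*x^2,
# horizon T, and terminal cost p_T*x^2.
a, q, T, p_T = -1.0, 1.0, 1.0, 0.0
degree = 5  # polynomial order of the time-varying critic basis


def critic_basis(s):
    """Polynomial basis in time-to-go s = T - t.

    The critic approximates V(x, t) ~= x^2 * (critic_basis(s) @ w).
    """
    return np.array([s**j for j in range(degree + 1)])


def residual_row(s):
    """Coefficients multiplying w in the HJB residual divided by x^2.

    residual / x^2 = dp/dt + 2*a*p + q, where p = critic_basis(s) @ w
    and dp/dt = -dp/ds because s = T - t.
    """
    dp_ds = np.array(
        [j * s ** (j - 1) if j > 0 else 0.0 for j in range(degree + 1)]
    )
    return -dp_ds + 2.0 * a * critic_basis(s)


def fit_critic_weights(num_points=50, terminal_weight=10.0):
    """Least-squares fit of the critic weights.

    Minimizes the squared residual at collocation points plus a
    weighted penalty enforcing the terminal condition p(s=0) = p_T.
    """
    s_grid = np.linspace(0.0, T, num_points)
    A = np.array([residual_row(s) for s in s_grid])
    b = -q * np.ones(num_points)
    A = np.vstack([A, terminal_weight * critic_basis(0.0)])
    b = np.append(b, terminal_weight * p_T)
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w


weights = fit_critic_weights()
p_hat = critic_basis(T) @ weights          # critic value at t = 0
p_exact = 0.5 * (1.0 - np.exp(-2.0 * T))   # closed form for this toy case
print(abs(p_hat - p_exact))
```

Parameterizing the basis by time-to-go rather than absolute time keeps the terminal condition a simple constraint on the first weight. In the setting described in the abstract, the residual would instead be built from measured state data, with one critic per agent.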

Full Text