This article studies the optimal synchronization of linear heterogeneous multiagent systems (MASs) with partially unknown system dynamics. The objective is to achieve system synchronization while minimizing the performance index of each agent. A framework of heterogeneous multiagent graphical games is formulated first. Within these graphical games, it is proved that the optimal control policy, obtained from the solution of the Hamilton-Jacobi-Bellman (HJB) equation, is not only a Nash equilibrium but also the best response to the fixed control policies of its neighbors. To compute the optimal control policy and the minimum value of the performance index, a model-based policy iteration (PI) algorithm is proposed. Building on the model-based algorithm, a data-based off-policy integral reinforcement learning (IRL) algorithm is then developed to handle the partially unknown system dynamics. Furthermore, a single-critic neural network (NN) structure is used to implement the data-based algorithm. Using the data collected under the behavior policy of the off-policy algorithm, the gradient descent method trains the NNs to approximate the ideal weights. In addition, it is proved that all the proposed algorithms converge and that the weight-tuning law of the single-critic NNs achieves optimal synchronization. Finally, a numerical example is presented to demonstrate the effectiveness of the theoretical analysis.
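For concreteness, the following is a generic sketch of the model-based PI step for such graphical games, under standard assumptions of a quadratic local cost and neighborhood error dynamics that are affine in the control; the symbols $\delta_i$, $Q_{ii}$, $R_{ii}$, and $g_i$ are illustrative and not taken from the article itself.

\begin{align*}
&\text{Policy evaluation: solve for } V_i^{(k+1)} \text{ from the Bellman equation} \\
&\qquad 0 = \delta_i^\top Q_{ii}\,\delta_i + \big(u_i^{(k)}\big)^\top R_{ii}\,u_i^{(k)} + \big(\nabla V_i^{(k+1)}\big)^\top \dot{\delta}_i\big(u_i^{(k)}, u_{-i}^{(k)}\big), \\
&\text{Policy improvement:} \\
&\qquad u_i^{(k+1)} = -\tfrac{1}{2}\,R_{ii}^{-1}\, g_i^\top\, \nabla V_i^{(k+1)},
\end{align*}

where $\delta_i$ denotes agent $i$'s local neighborhood synchronization error, $u_{-i}$ collects the control policies of its neighbors, and $g_i$ is the input matrix of the error dynamics. Iterating these two steps from an admissible initial policy approaches the solution of the coupled HJB equations. In the off-policy IRL variant, the model-dependent evaluation step is replaced by an integral Bellman equation evaluated along data generated by a behavior policy, and the single-critic approximation $V_i(\delta_i) \approx \hat{W}_i^\top \phi_i(\delta_i)$ is tuned by gradient descent on the squared Bellman residual.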