Abstract

This study proposes a model-free distributed output feedback control scheme that achieves synchronisation of the outputs of heterogeneous follower agents with that of the leader agent in a directed network. A distributed two-degree-of-freedom approach is presented that separates the learning of the optimal output-feedback and feedforward terms of the local control law for each agent. The local feedback parameters are learned using the proposed off-policy Q-learning algorithm, while a gradient adaptive law is presented to learn the local feedforward control parameters so that each agent achieves asymptotic tracking. This learning scheme and the resulting distributed control laws neither require access to the local internal state of the agents nor need an additional distributed observer of the leader state. The proposed approach has an advantage over previous state-augmentation approaches in that it circumvents the need to introduce a discounting factor in the local performance functions. It is shown that the proposed algorithm converges to the optimal solution of the algebraic Riccati equation and to the output regulator equations without explicitly solving them, provided the leader agent is reachable, directly or indirectly, from every follower agent. Simulation results validate the proposed scheme.
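To make the two-degree-of-freedom idea concrete, the sketch below illustrates it for a single follower in a simplified setting; it is not the paper's algorithm. All names and numbers (the matrices A, B, C, S, F, the weights Q, R, the gains K, L, and the step size gamma) are assumptions introduced only for illustration, the simulation model is used solely to generate data for the learner, and the distributed/graph aspects are omitted. The feedback gain is learned by an off-policy, policy-iteration-style Q-learning loop on measured data, and the feedforward gain is adapted by a gradient-style update driven by the local output tracking error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed follower dynamics x+ = A x + B u, output y = C x (used only to generate data)
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

# Assumed leader/exosystem generating the reference: w+ = S w, y_ref = F w
S = np.array([[np.cos(0.1), np.sin(0.1)], [-np.sin(0.1), np.cos(0.1)]])
F = np.array([[1.0, 0.0]])

Q, R = np.eye(2), np.eye(1)   # assumed local performance weights

def q_learning_feedback(num_iters=20, num_samples=200):
    """Policy-iteration-style Q-learning for the feedback gain.
    Only measured (x, u, x+) data enter the least-squares problem;
    the behaviour policy adds exploration noise, so evaluation is off-policy."""
    K = np.zeros((1, 2))                                   # initial admissible gain (A is stable)
    for _ in range(num_iters):
        Phi, Tgt = [], []
        x = rng.standard_normal(2)
        for _ in range(num_samples):
            u = -K @ x + 0.5 * rng.standard_normal(1)      # behaviour policy with exploration
            xn = A @ x + B @ u
            z = np.concatenate([x, u])
            zn = np.concatenate([xn, -K @ xn])             # target policy applied at next state
            # Bellman identity z'Hz - zn'Hzn = x'Qx + u'Ru, linear in the entries of H
            Phi.append(np.outer(z, z).ravel() - np.outer(zn, zn).ravel())
            Tgt.append(x @ Q @ x + u @ R @ u)
            x = xn
        h, *_ = np.linalg.lstsq(np.array(Phi), np.array(Tgt), rcond=None)
        H = h.reshape(3, 3)
        H = 0.5 * (H + H.T)                                # enforce symmetry of the Q-function kernel
        K = np.linalg.solve(H[2:, 2:], H[2:, :2])          # greedy policy improvement
    return K

def adapt_feedforward(K, gamma=0.05, steps=3000):
    """Gradient-style adaptation of the feedforward gain L in u = -K x + L w,
    with an update proportional to the tracking error times the leader signal
    (an approximation; convergence depends on the step size and the plant)."""
    L = np.zeros((1, 2))
    x, w = np.zeros(2), np.array([1.0, 0.0])
    for _ in range(steps):
        u = -K @ x + L @ w
        x = A @ x + B @ u
        w = S @ w
        e = C @ x - F @ w                                  # local output tracking error
        L -= gamma * np.outer(e, w)                        # error-driven adaptation step
    return L, float(np.linalg.norm(e))

K = q_learning_feedback()
L, final_err = adapt_feedforward(K)
print("learned feedback gain K:", K)
print("final tracking error:", final_err)
```

In this toy setting the feedback loop recovers an LQR-like gain from data alone, and the error-driven feedforward update drives the output toward the leader's trajectory; the paper's actual scheme additionally handles the directed network, the distributed exchange of information, and the convergence guarantees stated in the abstract.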
