Abstract

This paper is devoted to H∞ consensus design and online scheduling for homogeneous multiagent systems (MASs) with switching topologies via deep reinforcement learning. The model of homogeneous MASs with switching topologies is established based on switched-systems theory, in which the switching of topologies is viewed as switching among subsystems. By employing a linear transformation, the closed-loop systems of the MASs are converted into reduced-order systems, so the H∞ consensus design problem can be transformed into an H∞ control problem. The consensus protocol is assumed to consist of two parts: a dynamics-based protocol and a learning-based protocol, where the dynamics-based protocol guarantees convergence and the weighted attenuation level, and the learning-based protocol improves the transient performance. Then, the multiple Lyapunov function (MLF) method and the mode-dependent average dwell time (MDADT) method are combined to ensure the stability and the weighted H∞ disturbance attenuation index of the reduced-order systems. Sufficient conditions for the existence of the dynamics-based protocol are given in terms of feasible solutions of linear matrix inequalities (LMIs). Moreover, the online scheduling is formulated as a Markov decision process, and the deep deterministic policy gradient (DDPG) algorithm, in an actor-critic framework, is proposed to compensate for disturbances and explore the optimal control policy. The online scheduling of the MAS parameters is viewed as a bounded compensation of the dynamics-based protocol, whose stability can be guaranteed by nonfragile control theory. Finally, simulation results are provided to illustrate the effectiveness and superiority of the proposed method.
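
To make the scheme concrete, the following sketch summarizes, in generic notation, the two-part protocol and the standard forms of the weighted H∞ attenuation and MDADT conditions mentioned above; all symbols (u_i, K_σ, a_ij, Δu_i, γ, λ_p, μ_p, τ_ap) are illustrative placeholders and are not taken from the paper.

    % Two-part consensus protocol for agent i under switching signal \sigma(t):
    %   LMI-designed dynamics-based term plus a bounded learning-based term
    u_i(t) = \underbrace{K_{\sigma(t)} \sum_{j \in \mathcal{N}_i(\sigma(t))}
              a_{ij}^{\sigma(t)} \bigl( x_j(t) - x_i(t) \bigr)}_{\text{dynamics-based}}
            + \underbrace{\Delta u_i(t)}_{\text{learning-based},\ \|\Delta u_i(t)\| \le \bar{u}}

    % Weighted H_\infty disturbance attenuation under zero initial conditions:
    \int_0^{\infty} e^{-\lambda s}\, z^{\top}(s) z(s)\, ds
        \;\le\; \gamma^2 \int_0^{\infty} w^{\top}(s) w(s)\, ds

    % Mode-dependent average dwell time condition for each topology mode p:
    \tau_{ap} \;\ge\; \tau_{ap}^{*} = \frac{\ln \mu_p}{\lambda_p}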

Highlights

  • As a typical class of complex networks, multiagent systems (MASs) have attracted increasing attention because of their great potential for applications in numerous areas of engineering and modern industry, such as formation control, attitude alignment, multiple-missile attack, and mobile sensor networks [1–3]

  • We can see that the combination of the dynamics-based consensus protocol and the learning-based consensus protocol ensures H∞ consensus, and that the transient performance can be improved by the deep deterministic policy gradient (DDPG) algorithm

  • The problem of H∞ consensus design and online scheduling for MASs with switching topologies is studied in this paper based on robust control theory and deep reinforcement learning

Summary

Introduction

As a typical class of complex networks, MASs have attracted increasing attention because of their great potential for applications in numerous areas of engineering and modern industry, such as formation control, attitude alignment, multiple-missile attack, and mobile sensor networks [1–3]. Sufficient conditions for stability analysis and the solution of the feedback protocol are formulated in terms of LMIs. In [33], the problem of robust consensus control for second-order MASs with switching topologies is investigated. Compared with the traditional ADT method, tighter bounds on the dwell time and less restrictive results are obtained. (3) The learning-based controller is proposed as a supplementary controller based on the DDPG algorithm to improve the transient performance. It is designed within the actor-critic framework and can be viewed as an uncertain compensation of the dynamics-based controller.
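
As an illustration only (not the authors' implementation), the following Python/PyTorch sketch shows how a DDPG actor-critic pair could generate a bounded supplementary control term that is simply added to the dynamics-based protocol; the class names ConsensusActor and ConsensusCritic, the network sizes, and the bound u_max are all hypothetical.

    # Illustrative sketch (not from the paper): a DDPG actor-critic pair whose
    # output acts as a bounded supplementary term added to the dynamics-based protocol.
    import torch
    import torch.nn as nn

    class ConsensusActor(nn.Module):
        """Maps the (reduced-order) consensus error state to a bounded correction."""
        def __init__(self, state_dim: int, action_dim: int, u_max: float):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_dim, 64), nn.ReLU(),
                nn.Linear(64, 64), nn.ReLU(),
                nn.Linear(64, action_dim), nn.Tanh(),  # output in (-1, 1)
            )
            self.u_max = u_max  # bound assumed by the nonfragile analysis

        def forward(self, state: torch.Tensor) -> torch.Tensor:
            return self.u_max * self.net(state)  # |delta_u| <= u_max elementwise

    class ConsensusCritic(nn.Module):
        """Estimates Q(state, action) for the supplementary control policy."""
        def __init__(self, state_dim: int, action_dim: int):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
                nn.Linear(64, 64), nn.ReLU(),
                nn.Linear(64, 1),
            )

        def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
            return self.net(torch.cat([state, action], dim=-1))

    def total_protocol(u_dynamics: torch.Tensor, actor: ConsensusActor,
                       state: torch.Tensor) -> torch.Tensor:
        """Combined protocol: LMI-designed dynamics-based term plus bounded DDPG term."""
        with torch.no_grad():
            delta_u = actor(state)
        return u_dynamics + delta_u

In a full DDPG loop, the critic would be trained from a replay buffer of (state, action, reward, next-state) transitions and the actor from the deterministic policy gradient, while the tanh saturation keeps the learned correction within the bound required by the nonfragile stability argument.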

Preliminaries and Problem Statement
Dynamics-Based Consensus Protocol Design
Learning-Based Consensus Protocol Design
Numerical Example
Conclusions
