Abstract

Multi-agent systems are widely studied for their ability to solve complex tasks in many fields, especially in deep reinforcement learning. Recently, distributed optimization over multi-agent systems has drawn much attention because of its extensive applications. This paper presents a projection-based continuous-time algorithm for solving convex distributed optimization problems with equality and inequality constraints over multi-agent systems. The distinguishing feature of such problems is that each agent has a private local cost function and private constraints and can communicate only with its neighbors. All agents aim to cooperatively optimize the sum of the local cost functions. With the aid of a penalty method, the states of the proposed algorithm enter the equality constraint set in fixed time and ultimately converge to an optimal solution of the problem. In contrast to some existing approaches, the continuous-time algorithm has fewer state variables, and the proof of convergence also establishes consensus. Finally, two simulations are given to show the viability of the algorithm.

Highlights

  • Reinforcement learning stems from an experiment on the behaviors of cats in 1898 by Thorndike [20]

  • Deep reinforcement learning (DRL) is an interdisciplinary combination of reinforcement learning and deep learning that copes with high-dimensional environments [17]

  • The distributed optimization problem is reformulated as a new problem without inequality constraints or consensus constraints

Summary

Introduction

Reinforcement learning stems from an experiment on the behavior of cats conducted by Thorndike in 1898 [20]. In this paper, we construct a projection-based continuous-time algorithm to solve distributed optimization problems with equality and inequality constraints over multi-agent systems. To solve distributed optimization problem (3), a projection-based continuous-time algorithm is proposed as follows: x_i(t). Proposition 4: The equilibrium point of continuous-time algorithm (18) is an optimal solution to distributed optimization problem (3), and vice versa. In this part, with the help of the Lyapunov method and the above preliminaries, we study the convergence of continuous-time algorithm (17). Remark 4: It is worth noting that the property of entering one of the constraint sets or the feasible region is shared by many continuous-time algorithms for solving optimization problems, such as [8,11,19,34].
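To make the idea of a projection-based continuous-time distributed algorithm concrete, the following is a minimal sketch of a generic scheme of this family, integrated with forward Euler. It is not the algorithm of this paper: equations (17) and (18) are not reproduced in this summary, and the sketch handles only a shared box constraint, with no penalty method and no equality constraints. The quadratic local costs f_i(x) = 0.5 (x - c_i)^2, the path graph, the auxiliary integral state w, the function name simulate and all numerical values are assumptions chosen for illustration.

```python
import numpy as np


def simulate(c, adjacency, lower, upper, step=0.01, num_steps=10000):
    """Forward-Euler integration of an illustrative projected consensus flow.

    Assumed dynamics (a generic sketch, not the paper's algorithm):
        dx_i/dt = P(x_i - grad f_i(x_i) - sum_j a_ij (x_i - x_j) - w_i) - x_i
        dw_i/dt = sum_j a_ij (x_i - x_j)
    where P clips onto the box [lower, upper] and f_i(x) = 0.5 * (x - c_i)**2.
    """
    c = np.asarray(c, dtype=float)
    adjacency = np.asarray(adjacency, dtype=float)
    # Graph Laplacian of the undirected communication graph.
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency

    x = np.full_like(c, lower)   # agent states, started inside the box
    w = np.zeros_like(c)         # integral of the disagreement, started at zero

    for _ in range(num_steps):
        grad = x - c                      # gradient of each local quadratic cost
        disagreement = laplacian @ x      # uses only neighbors' states
        target = np.clip(x - grad - disagreement - w, lower, upper)
        x = x + step * (target - x)       # projected flow for the states
        w = w + step * disagreement       # integral action enforcing consensus
    return x, w


# Three agents on a path graph 1 - 2 - 3 (hypothetical example data).
adjacency = np.array([[0.0, 1.0, 0.0],
                      [1.0, 0.0, 1.0],
                      [0.0, 1.0, 0.0]])
centers = [1.0, 2.0, 6.0]

x_final, w_final = simulate(centers, adjacency, lower=0.0, upper=2.5)
print("final agent states:", x_final)
```

In this toy problem the sum of the local costs is minimized over the box at the mean of the c_i clipped to the box, so the agents should agree on a common value near 2.5 even though each one only sees its own c_i and its neighbors' states. Because each Euler step is a convex combination of the current state and a point in the box, the states never leave the box.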
