Abstract
Multi-agent systems are widely studied because of their ability to solve complex tasks in many fields, especially deep reinforcement learning. Recently, distributed optimization over multi-agent systems has drawn much attention because of its extensive applications. This paper presents a projection-based continuous-time algorithm for solving convex distributed optimization problems with equality and inequality constraints over multi-agent systems. The distinguishing feature of such problems is that each agent has a private local cost function and private constraints, and can communicate only with its neighbors. All agents aim to cooperatively optimize the sum of the local cost functions. With the aid of a penalty method, the states of the proposed algorithm enter the equality constraint set in fixed time and ultimately converge to an optimal solution of the problem. In contrast to some existing approaches, the continuous-time algorithm has fewer state variables, and the verification of consensus is included in the proof of convergence. Finally, two simulations are given to show the viability of the algorithm.
Highlights
Reinforcement learning stems from an experiment on the behaviors of cats in 1898 by Thorndike [20]
Deep reinforcement learning (DRL) combines reinforcement learning and deep learning to cope with high-dimensional environments [17]
The distributed optimization problem is reformulated as a new problem without inequality constraints or consensus constraints
Summary
Reinforcement learning stems from an experiment on the behaviors of cats in 1898 by Thorndike [20]. In this paper, we construct a projection-based continuous-time algorithm to solve distributed optimization problems with equality and inequality constraints over multi-agent systems. To solve distributed optimization problem (3), a projection-based continuous-time algorithm is proposed for the agent states x_i(t). Proposition 4. The equilibrium point of continuous-time algorithm (18) is an optimal solution to distributed optimization problem (3), and vice versa. With the help of the Lyapunov method and the above preliminaries, we study the convergence of continuous-time algorithm (17). Remark 4. It is worth noting that many continuous-time algorithms for solving optimization problems share the property of entering one of the constraint sets or the feasible region, such as [8,11,19,34].
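The equations of the proposed algorithm are not reproduced in this summary, so the following is only a minimal sketch of the general idea behind projection-based continuous-time methods, not the paper's algorithm. It assumes the textbook projection dynamics dx/dt = -x + P(x - alpha * grad f(x)) for a single decision-maker minimizing a sum of local costs over a box, integrated with forward Euler. The quadratic local costs, the box bounds, and the names `project_box`, `projection_flow`, and `grad_sum` are all hypothetical choices made for illustration. The sketch shows the equilibrium property in the spirit of Proposition 4: the state stops moving exactly where x = P(x - alpha * grad f(x)), which is the optimality condition of the constrained problem.

```python
import numpy as np

def project_box(x, lo, hi):
    """Euclidean projection onto the box {x : lo <= x <= hi}."""
    return np.clip(x, lo, hi)

def projection_flow(grad, x0, lo, hi, alpha=0.2, step=0.1, iters=2000):
    """Forward-Euler integration of dx/dt = -x + P(x - alpha * grad(x))."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = x + step * (project_box(x - alpha * grad(x), lo, hi) - x)
    return x

# Hypothetical local costs f_i(x) = 0.5 * ||x - a_i||^2 (one row of `targets`
# per agent); the team objective is their sum.
targets = np.array([[1.0, 4.0], [2.0, 5.0], [6.0, 3.0]])

def grad_sum(x):
    """Gradient of the summed cost: sum_i (x - a_i)."""
    return np.sum(x - targets, axis=0)

# Assumed box constraint; the unconstrained minimizer is the mean of the
# targets, (3, 4), so the upper bound 2.5 on the first coordinate is active.
lo = np.array([0.0, 0.0])
hi = np.array([2.5, 10.0])

x_star = projection_flow(grad_sum, x0=[0.0, 0.0], lo=lo, hi=hi)

# Equilibrium residual: zero exactly when x = P(x - alpha * grad(x)).
residual = np.linalg.norm(
    x_star - project_box(x_star - 0.2 * grad_sum(x_star), lo, hi)
)
print(x_star, residual)
```

In this toy problem the cost is separable, so the constrained optimum is the unconstrained minimizer clipped to the box, which gives an independent check on the point the flow settles at. The sketch omits everything that makes the setting distributed: there is no communication graph, no consensus mechanism, and no equality constraint or penalty term.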