Bi-Level Actor-Critic for Multi-Agent Coordination

Haifeng Zhang, Zeren Huang, Weizhe Chen, Minne Li, Jun Wang, Yaodong Yang, Weinan Zhang

doi:10.1609/aaai.v34i05.6226

Abstract

Coordination is one of the essential problems in multi-agent systems. Typically, multi-agent reinforcement learning (MARL) methods treat agents equally, and the goal is to solve the Markov game to an arbitrary Nash equilibrium (NE) when multiple equilibria exist, thus lacking a solution for NE selection. In this paper, we treat agents unequally and consider Stackelberg equilibrium as a potentially better convergence point than Nash equilibrium in terms of Pareto superiority, especially in cooperative environments. Under Markov games, we formally define the bi-level reinforcement learning problem of finding a Stackelberg equilibrium. We propose a novel bi-level actor-critic learning method that allows agents to have different knowledge bases (thus intelligent), while their actions can still be executed simultaneously and in a distributed manner. A convergence proof is given, and the resulting learning algorithm is tested against state-of-the-art methods. We found that the proposed bi-level actor-critic algorithm successfully converged to the Stackelberg equilibria in matrix games and found an asymmetric solution in a highway merge environment.
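To make the NE-selection issue in the abstract concrete, the following is a minimal sketch on a small matrix game. It is not the paper's bi-level actor-critic algorithm; it only enumerates pure-strategy Nash equilibria and a pure-strategy Stackelberg solution by brute force. The payoff matrices, the function names `pure_nash` and `stackelberg`, and the tie-breaking rule (the follower takes the first best response) are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical 2x2 cooperative game; payoffs are illustrative, not from the paper.
# Rows index the leader's action, columns index the follower's action.
leader_payoff = np.array([[10, 0],
                          [0, 5]])
follower_payoff = np.array([[10, 0],
                            [0, 5]])


def pure_nash(A, B):
    """Return every pure-strategy Nash equilibrium (i, j) of the bimatrix game."""
    equilibria = []
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            row_is_best = A[i, j] >= A[:, j].max()  # row player cannot gain by deviating
            col_is_best = B[i, j] >= B[i, :].max()  # column player cannot gain by deviating
            if row_is_best and col_is_best:
                equilibria.append((i, j))
    return equilibria


def stackelberg(A, B):
    """Leader commits to a row; follower best-responds; leader keeps the best row.

    Assumption: ties in the follower's best response go to the first index.
    """
    best = None
    for i in range(A.shape[0]):
        j = int(np.argmax(B[i]))  # follower's best response to leader action i
        if best is None or A[i, j] > A[best[0], best[1]]:
            best = (i, j)
    return best


nash_points = pure_nash(leader_payoff, follower_payoff)
stackelberg_point = stackelberg(leader_payoff, follower_payoff)
print("Pure Nash equilibria:", nash_points)
print("Stackelberg solution:", stackelberg_point)
```

In this toy game there are two pure Nash equilibria, one of which Pareto-dominates the other, so a method that converges to an arbitrary NE may settle on the worse one. The leader-follower ordering selects the dominant outcome here, which is the kind of selection the abstract argues for.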


Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence
Publication Date: Apr 3, 2020
Citations: 29

Similar Papers

Lessons learned in single-agent and multiagent learning with robot foraging
Z Ren ... A.B Williams
10 Nov 2003

Evaluating semi-cooperative Nash/Stackelberg Q-learning for traffic routes plan in a single intersection
Jian Guo ... Istvan Harmati
Control Engineering Practice | VOL. 102
30 Jun 2020

Inducing Stackelberg Equilibrium through Spatio-Temporal Sequential Decision-Making in Multi-Agent Reinforcement Learning
Bin Zhang ... Zhiwei Xu
01 Aug 2023

Method of Multi-Agent Reinforcement Learning in Systems with a Variable Number of Agents
V I Petrenko ... F B Tebueva
MEHATRONIKA, AVTOMATIZACIA, UPRAVLENIE | VOL. 23
09 Oct 2022
