SA-IGA: a multiagent reinforcement learning method towards socially optimal outcomes

Chengwei Zhang, Wanli Xue, Zhiyong Feng, Siqi Chen, Xiaohong Li, Karl Tuyls, Jianye Hao

doi:10.1007/s10458-019-09411-3

Abstract

In multiagent environments, the ability to learn is important for an agent to behave appropriately in the face of unknown opponents and dynamic environments. From the system designer's perspective, it is desirable for agents to learn to coordinate towards socially optimal outcomes while also avoiding exploitation by selfish opponents. To this end, we propose a novel gradient-ascent-based algorithm (SA-IGA) that augments the basic gradient-ascent algorithm by incorporating social awareness into the policy update process. We theoretically analyze the learning dynamics of SA-IGA using dynamical system theory and show that SA-IGA has linear dynamics for a wide range of games, including symmetric games. The learning dynamics of two representative games (the prisoner's dilemma game and the coordination game) are analyzed in detail. Based on the idea of SA-IGA, we further propose a practical multiagent learning algorithm, called SA-PGA, based on the Q-learning update rule. Simulation results show that an SA-PGA agent can achieve higher social welfare than the previous social-optimality-oriented Conditional Joint Action Learner (CJAL) and is also robust against individually rational opponents, reaching Nash equilibrium solutions.
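
To make the motivation concrete, the following minimal sketch runs the basic gradient-ascent baseline that the abstract says SA-IGA augments. It is not SA-IGA itself. It is run on a two-player prisoner's dilemma, with small finite steps standing in for the infinitesimal step size of the theoretical analysis. The payoff values (3, 0, 5, 1), the step size, the starting policies, and all function names are illustrative assumptions, not taken from the paper. Under these assumptions both purely self-interested learners drift to mutual defection, the Nash equilibrium, where social welfare is lower than under mutual cooperation.

```python
# Plain (non-social) gradient ascent in a 2x2 game; NOT the paper's SA-IGA.
# Action 0 = cooperate, action 1 = defect.
# Payoff values below are assumed for illustration only.
ROW_PAYOFF = [[3, 0], [5, 1]]  # row player's payoff[row_action][col_action]
COL_PAYOFF = [[3, 5], [0, 1]]  # column player's payoff[row_action][col_action]


def expected_payoff(payoff, p, q):
    """Expected payoff when row plays action 0 w.p. p and column w.p. q."""
    return (p * q * payoff[0][0] + p * (1 - q) * payoff[0][1]
            + (1 - p) * q * payoff[1][0] + (1 - p) * (1 - q) * payoff[1][1])


def clip(x):
    """Keep a probability inside [0, 1]."""
    return min(1.0, max(0.0, x))


def gradient_ascent(row_payoff, col_payoff, p, q, step=0.01, iterations=1000):
    """Each player ascends the gradient of its OWN expected payoff."""
    r, c = row_payoff, col_payoff
    for _ in range(iterations):
        # Partial derivative of the row player's expected payoff w.r.t. p.
        grad_p = q * (r[0][0] - r[0][1] - r[1][0] + r[1][1]) + (r[0][1] - r[1][1])
        # Partial derivative of the column player's expected payoff w.r.t. q.
        grad_q = p * (c[0][0] - c[0][1] - c[1][0] + c[1][1]) + (c[1][0] - c[1][1])
        # Simultaneous update, projected back onto valid probabilities.
        p, q = clip(p + step * grad_p), clip(q + step * grad_q)
    return p, q


def social_welfare(p, q):
    """Sum of both players' expected payoffs."""
    return expected_payoff(ROW_PAYOFF, p, q) + expected_payoff(COL_PAYOFF, p, q)


if __name__ == "__main__":
    p, q = gradient_ascent(ROW_PAYOFF, COL_PAYOFF, p=0.9, q=0.9)
    print("cooperation probabilities:", p, q)
    print("welfare reached:", social_welfare(p, q))
    print("welfare under mutual cooperation:", social_welfare(1.0, 1.0))
```

The gap between the two welfare values is the kind of outcome that a socially aware policy update is meant to close.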

Journal: Autonomous Agents and Multi-Agent Systems
Publication Date: May 15, 2019
Citations: 15

Similar Papers

Emergence of super cooperation of prisoner's dilemma games on scale-free networks.
Angsheng Li ... Xi Yong
PLOS ONE | VOL. 10
02 Feb 2015

Locus of control and learning to cooperate in a prisoner's dilemma game
Christophe Boone ... Arjen Van Witteloostuijn
Personality and Individual Differences | VOL. 32
08 Mar 2002

Prior experience and patterning in a prisoner's dilemma game
Albert Silverstein ... David Cross
Journal of Behavioral Decision Making | VOL. 11
01 Jun 1998

7 - Experiments With Prisoner's Dilemma and Related Games
Andrew M Colman
Game Theory and Experimental Games | VOL. -
01 Jan 1981
