Abstract

Direct reciprocity, i.e., repeated interaction, is a main mechanism sustaining cooperation in social dilemmas between two individuals. For larger groups and networks, which are probably more relevant to understanding and engineering our society, experiments employing repeated multiplayer social dilemma games have suggested that humans often show conditional cooperation and its moody variant. The mechanisms underlying these behaviors remain largely unclear. Here we provide a proximate account of these behaviors by showing that individuals adopting a type of reinforcement learning, called aspiration learning, phenomenologically behave as conditional cooperators. By definition, individuals are satisfied if and only if the obtained payoff exceeds a fixed aspiration level. They reinforce actions that have resulted in satisfactory outcomes and anti-reinforce those yielding unsatisfactory outcomes. The results obtained in the present study are general in that they explain extant experimental results on both so-called moody and non-moody conditional cooperation, on prisoner's dilemma and public goods games, and on well-mixed groups and networks. In contrast to previous theory, individuals are assumed to have no access to information about what other individuals are doing, so they cannot explicitly use conditional cooperation rules. In this sense, myopic aspiration learning, in which the unconditional propensity to cooperate is modulated in every discrete time step, explains the conditional behavior of humans. Aspiration learners showing (moody) conditional cooperation obey a noisy GRIM-like strategy. This differs from Pavlov, a reinforcement learning strategy that promotes mutual cooperation in two-player situations.
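To make the learning rule concrete, the following is a minimal sketch of an aspiration-based (Bush–Mosteller-type) update of the cooperation probability in Python. The function name, the learning rate, and the clipping of the stimulus to [-1, 1] are illustrative assumptions, not the paper's exact parameterization.

```python
def aspiration_update(p_coop, action, payoff, aspiration, learning_rate=0.5):
    """One aspiration-based update of the probability of cooperating.

    Satisfaction (payoff above the aspiration level) reinforces the
    action just taken; dissatisfaction anti-reinforces it. Names and
    the clipping below are assumptions for illustration.
    """
    # Stimulus: positive when the payoff met the aspiration, negative otherwise.
    stimulus = max(-1.0, min(1.0, payoff - aspiration))
    if action == "C":
        if stimulus >= 0:
            # Satisfied after cooperating: push p_coop toward 1.
            p_coop += (1.0 - p_coop) * learning_rate * stimulus
        else:
            # Dissatisfied after cooperating: push p_coop toward 0.
            p_coop += p_coop * learning_rate * stimulus
    else:  # action == "D"
        if stimulus >= 0:
            # Satisfied after defecting: push p_coop toward 0.
            p_coop -= p_coop * learning_rate * stimulus
        else:
            # Dissatisfied after defecting: push p_coop toward 1.
            p_coop -= (1.0 - p_coop) * learning_rate * stimulus
    return p_coop

# Example: a cooperator whose payoff fell short of its aspiration
# becomes less likely to cooperate on the next round (0.8 -> 0.4 here).
p = aspiration_update(p_coop=0.8, action="C", payoff=1.0, aspiration=2.0)
```

Note that the update uses only the learner's own action and payoff; any dependence of cooperation on what others did can therefore emerge only as a statistical pattern of this rule, which is the sense in which aspiration learning "explains" conditional cooperation.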

Highlights

  • Humans very often cooperate with each other when free-riding on others’ efforts is ostensibly lucrative

  • We show that players adopting a type of reinforcement learning exhibit these conditional cooperation behaviors

  • We provide an account for experimentally observed conditional cooperation (CC) and moody conditional cooperation (MCC) patterns using a family of reinforcement learning rules called aspiration learning [27–36]


Introduction

Humans very often cooperate with each other when free-riding on others' efforts is ostensibly lucrative. Among the various mechanisms enabling cooperation in social dilemma situations, direct reciprocity, i.e., repeated interaction between a pair of individuals, is widespread. Past theoretical research using the two-player prisoner's dilemma game (PDG) identified tit-for-tat (TFT) [2], generous TFT [3], and a win-stay lose-shift strategy often called Pavlov [4–6] as representative strong competitors in the repeated two-player PDG. Direct reciprocity in larger groups corresponds to a family of action rules collectively called conditional cooperation (CC), a multiplayer variant of TFT. An individual employing CC cooperates if a sufficiently large fraction of the other group members cooperated in the previous round. Depending on the parameter values, the outcome of the learning process shows the CC patterns and their moody variant that have been observed in behavioral experiments.
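For reference, the following toy Python functions sketch these action rules. The threshold form of CC and Pavlov's payoff threshold are simplifying assumptions for illustration, not definitions taken from the paper.

```python
def tit_for_tat(opponent_last_action):
    """TFT: cooperate first, then copy the opponent's previous action."""
    return "C" if opponent_last_action is None else opponent_last_action

def pavlov(my_last_action, my_last_payoff, payoff_threshold):
    """Win-stay lose-shift: repeat the last action iff it paid off.

    The payoff_threshold separating a "win" from a "lose" is an
    assumption for illustration.
    """
    if my_last_action is None:
        return "C"
    if my_last_payoff >= payoff_threshold:
        return my_last_action                          # win-stay
    return "D" if my_last_action == "C" else "C"       # lose-shift

def conditional_cooperation(num_other_cooperators, group_size,
                            frac_threshold=0.5):
    """CC: cooperate iff enough of the other group members cooperated
    in the previous round (a simple threshold variant)."""
    frac = num_other_cooperators / (group_size - 1)
    return "C" if frac >= frac_threshold else "D"
```

Unlike these rules, the aspiration learners studied here cannot condition on others' actions at all; the CC-like pattern arises from payoff feedback alone, as in the update sketch above.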

