Abstract

We show that the history of play in a population game contains exploitable information that sophisticated strategies can use to defeat memory-one opponents, including zero determinant strategies. The history allows a player to label opponents by their strategies, enabling the player to determine the population distribution and to act differentially based on the opponent’s strategy in each pairwise interaction. For the Prisoner’s Dilemma, these advantages lead to the natural formation of cooperative coalitions among similarly behaving players and eventually to unilateral defection against opposing player types. We show analytically and empirically that optimal play in population games depends strongly on the population distribution. For example, the optimal strategy for a minority player type against a resident TFT population is ALLC, while for a majority player type the optimal strategy versus TFT players is ALLD. Such behaviors are not accessible to memory-one strategies. Drawing inspiration from Sun Tzu’s The Art of War, we implemented a non-memory-one strategy for population games, based on techniques from machine learning and statistical inference, that can exploit the history of play in this manner. Via simulation we find that this strategy is essentially uninvadable and can successfully invade, with significantly higher probability than a neutral mutant, essentially all known memory-one strategies for the Prisoner’s Dilemma, including ALLC (always cooperate), ALLD (always defect), tit-for-tat (TFT), win-stay-lose-shift (WSLS), and zero determinant (ZD) strategies, the latter including extortionate and generous strategies.

Highlights

  • The Prisoner’s Dilemma (PD) [1] is a two player game with a long history of study in evolutionary game theory [2] and finite populations [3]

  • In a tournament emulating the influential contest conducted by Axelrod [15], Stewart and Plotkin showed that some zero determinant (ZD) strategies are very successful; Adami and Hintze [13] showed that ZD strategies are evolutionarily unstable in general but can be effective if opponents can be identified and play can depend on the opponent’s type

  • Fixation probabilities for zero-determinant strategies were studied by Stewart and Plotkin [20] in the case of weak selection

Summary

Introduction

The Prisoner’s Dilemma (PD) [1] is a two-player game with a long history of study in evolutionary game theory [2] and finite populations [3]. There are many well-known strategies for the Prisoner’s Dilemma, such as ALLC (always cooperate), ALLD (always defect), tit-for-tat (TFT) [6] and win-stay-lose-shift (WSLS) [7]. The discovery of zero determinant strategies by Press and Dyson [8] has invigorated the study of the Prisoner’s Dilemma, including the evolutionary stability of these strategies in population games and their relationship to and impact on the evolution of cooperation [2] [9] [10] [11] [12] [13] [14]. How a strategy fares against itself becomes crucial in population games.
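To make the named strategies concrete, the following is a minimal sketch of ALLC, ALLD, TFT and WSLS as memory-one strategies, each given as a first-move cooperation probability plus a cooperation probability for each previous outcome. It is an illustration only, not the authors' implementation. The payoff values (T=5, R=3, P=1, S=0), the `play` helper, the round count and the seed are all assumptions introduced here.

```python
import random

# Payoff to the focal player, keyed by (own move, opponent move).
# Assumed conventional values: T=5, R=3, P=1, S=0 (not taken from the paper).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def memory_one(first, cc, cd, dc, dd):
    """Build a memory-one strategy from cooperation probabilities.

    `first` is the probability of cooperating on the first move; the others
    are the probabilities of cooperating after each (own, opponent) outcome.
    """
    table = {("C", "C"): cc, ("C", "D"): cd, ("D", "C"): dc, ("D", "D"): dd}
    return first, table


STRATEGIES = {
    "ALLC": memory_one(1.0, 1.0, 1.0, 1.0, 1.0),  # always cooperate
    "ALLD": memory_one(0.0, 0.0, 0.0, 0.0, 0.0),  # always defect
    "TFT": memory_one(1.0, 1.0, 0.0, 1.0, 0.0),   # copy opponent's last move
    "WSLS": memory_one(1.0, 1.0, 0.0, 0.0, 1.0),  # repeat after R or T, else switch
}


def play(name_a, name_b, rounds=200, seed=0):
    """Return the mean per-round payoffs (a, b) of an iterated game."""
    rng = random.Random(seed)
    first_a, table_a = STRATEGIES[name_a]
    first_b, table_b = STRATEGIES[name_b]
    move_a = "C" if rng.random() < first_a else "D"
    move_b = "C" if rng.random() < first_b else "D"
    total_a = total_b = 0
    for _ in range(rounds):
        total_a += PAYOFF[(move_a, move_b)]
        total_b += PAYOFF[(move_b, move_a)]
        # Each player conditions only on the previous round, seen from its own side.
        next_a = "C" if rng.random() < table_a[(move_a, move_b)] else "D"
        next_b = "C" if rng.random() < table_b[(move_b, move_a)] else "D"
        move_a, move_b = next_a, next_b
    return total_a / rounds, total_b / rounds


if __name__ == "__main__":
    for pair in [("TFT", "TFT"), ("ALLD", "ALLD"), ("ALLD", "TFT")]:
        print(pair, play(*pair))
```

Under these assumed payoffs, TFT earns the mutual-cooperation payoff against itself while ALLD earns only the mutual-defection payoff against itself, which is one way to see why self-play matters once a strategy becomes common in a population.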

