Evolutionary Optimization of Cooperative Strategies for the Iterated Prisoner's Dilemma

Jessie Finocchiaro, H David Mathias

doi:10.1109/tg.2020.3005124

Abstract

The iterated prisoner's dilemma (IPD) has been studied in fields as diverse as economics, computer science, psychology, politics, and environmental studies. This is due, in part, to the intriguing property that its Nash equilibrium is not globally optimal. The IPD is typically treated as a single-objective problem in which a player's goal is to maximize their own score. In some work, minimizing the opponent's score is an additional objective. In this article, we explore the role of explicitly optimizing for mutual cooperation in IPD player performance. We implement a genetic algorithm in which each member of the population evolves using one of four multiobjective fitness functions: selfish, communal, cooperative, and selfless, the last three of which use a cooperative metric as an objective. As a control, we also consider two single-objective fitness functions. We explore the role of representation in evolving cooperation by implementing four representations for evolving players. Finally, we evaluate the effect of noise on the evolution of cooperative behaviors. Testing our evolved players in tournaments in which a player's own score is the sole metric, we find that players evolved with mutual cooperation as an objective are very competitive. Thus, learning to play nicely with others is a successful strategy for maximizing personal reward.
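
To make the quantities named in the abstract concrete, the following is a minimal Python sketch of a single IPD match that reports each player's score together with the fraction of rounds in which both players cooperated. It is an illustration under stated assumptions, not the article's implementation: the payoff values (5, 3, 1, 0) are the conventional ones from the IPD literature and are not given in the abstract, and the names `PAYOFFS`, `play_match`, `tit_for_tat`, and `always_defect` are hypothetical. The mutual cooperation rate computed here is only a stand-in for the cooperative metric, whose exact definition the abstract does not state.

```python
# Conventional IPD payoffs (an assumption; the abstract does not list them).
# Keys are (move_a, move_b); values are (payoff_a, payoff_b).
PAYOFFS = {
    ("C", "C"): (3, 3),  # mutual cooperation
    ("C", "D"): (0, 5),  # A is exploited by B
    ("D", "C"): (5, 0),  # A exploits B
    ("D", "D"): (1, 1),  # mutual defection (the one-shot Nash equilibrium)
}


def tit_for_tat(own_history, opp_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opp_history[-1] if opp_history else "C"


def always_defect(own_history, opp_history):
    """Defect in every round."""
    return "D"


def play_match(player_a, player_b, rounds=100):
    """Play a noise-free match.

    Returns (score_a, score_b, mutual_cooperation_rate), where the rate is
    the fraction of rounds in which both players chose "C". This rate is a
    hypothetical stand-in for a cooperative objective.
    """
    hist_a, hist_b = [], []
    score_a = score_b = 0
    mutual = 0
    for _ in range(rounds):
        move_a = player_a(hist_a, hist_b)
        move_b = player_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFFS[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        if move_a == "C" and move_b == "C":
            mutual += 1
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b, mutual / rounds
```

Under these assumed payoffs, two always-defecting players each earn 1 point per round, while two tit-for-tat players each earn 3 points per round. This gap is the sense in which the equilibrium outcome is not globally optimal, and it suggests why a cooperation-based objective need not conflict with maximizing one's own score.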

Full Text