Abstract

Given an arbitrary black-box strategy for the Iterated Prisoner's Dilemma, it is often difficult to gauge to what extent it can be exploited by other strategies. In the presence of imperfect public monitoring and the resulting observation errors, deriving a theoretical solution is even more time-consuming. However, for any strategy, the reinforcement learning algorithm Q-Learning can construct a best response in the limit. In this article I present and discuss several improvements to the Q-Learning algorithm that allow for a simple numerical measure of the exploitability of a given strategy. Additionally, I give a detailed introduction to reinforcement learning aimed at economists.
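To make the setup concrete, the following is a minimal illustrative sketch, not the article's implementation: tabular Q-Learning playing the Iterated Prisoner's Dilemma against a fixed black-box strategy (here Tit-for-Tat) under noisy public signals, with the learned policy's average payoff used as a rough numerical measure of exploitability. The payoff values, state encoding, noise model, and hyperparameters are assumptions chosen for illustration only.

```python
import random
from collections import defaultdict

# Stage-game payoffs for the row player (C = cooperate, D = defect); values are illustrative.
PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}
ACTIONS = ['C', 'D']

def observe(action, noise):
    """Public signal of an action, flipped with probability `noise` (imperfect monitoring)."""
    if random.random() < noise:
        return 'D' if action == 'C' else 'C'
    return action

def tit_for_tat(observed_opponent_action):
    """Fixed black-box strategy: repeat the opponent's last observed move, start with C."""
    return observed_opponent_action if observed_opponent_action else 'C'

def q_learn_best_response(opponent=tit_for_tat, noise=0.1, episodes=200,
                          rounds=200, alpha=0.1, gamma=0.95, eps=0.1):
    """Tabular Q-Learning against a fixed opponent; the state is the pair of
    publicly observed actions from the previous round."""
    Q = defaultdict(float)
    for _ in range(episodes):
        state = ('start', 'start')
        opp_view = None          # what the opponent observed the learner play last round
        for _ in range(rounds):
            # epsilon-greedy action choice
            if random.random() < eps:
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: Q[(state, x)])
            b = opponent(opp_view)
            r = PAYOFF[(a, b)]
            # both players only see noisy public signals of the realized actions
            obs_a, obs_b = observe(a, noise), observe(b, noise)
            next_state = (obs_a, obs_b)
            best_next = max(Q[(next_state, x)] for x in ACTIONS)
            Q[(state, a)] += alpha * (r + gamma * best_next - Q[(state, a)])
            state, opp_view = next_state, obs_a
    return Q

def exploitability(Q, opponent=tit_for_tat, noise=0.1, rounds=10_000):
    """Average per-round payoff of the greedy learned policy: a rough
    numerical measure of how far the fixed strategy can be exploited."""
    state, opp_view, total = ('start', 'start'), None, 0.0
    for _ in range(rounds):
        a = max(ACTIONS, key=lambda x: Q[(state, x)])
        b = opponent(opp_view)
        total += PAYOFF[(a, b)]
        obs_a, obs_b = observe(a, noise), observe(b, noise)
        state, opp_view = (obs_a, obs_b), obs_a
    return total / rounds

if __name__ == '__main__':
    Q = q_learn_best_response()
    print('Approximate best-response payoff vs. Tit-for-Tat:',
          round(exploitability(Q), 2))
```

In this sketch, exploitability is simply the average per-round payoff the learned policy achieves against the fixed strategy; the article's refinements to Q-Learning are not reproduced here.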
