Abstract

The discrete strategy game, in which agents can only choose cooperation or defection, has received considerable attention. However, this assumption seems implausible in the real world, where choices may be continuous or mixed. Furthermore, when applying Q-learning to continuous or mixed strategy games, one of the challenges is that the learning space grows drastically as the number of states and actions increases. In this article, we therefore redesign the Q-learning method by incorporating spatial reciprocity: agents interact only with their four neighbours to obtain rewards and learn their actions by taking their neighbours' strategies into account. As a result, the learning state and action space reduces to a 5 × 5 table that stores the state and action of the focal agent and its four neighbours, avoiding the curse of dimensionality caused by continuous or mixed strategy games. The numerical simulation results reveal striking differences among the three classes of games. In particular, the discrete strategy game is more sensitive to the settings of the relevant parameters, whereas the other two strategy games are relatively stable. At the same time, in terms of promoting cooperation, the mixed strategy game consistently outperforms the continuous one.
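The abstract describes restricting the Q-learning state and action space to the focal agent and its four lattice neighbours. The following is a minimal sketch of how such a spatially constrained 5 × 5 Q-table update might be organized; the specific state encoding (number of cooperating neighbours), the five discretized strategy levels, the parameter names (`ALPHA`, `GAMMA`, `EPSILON`), and the lattice size are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

# Sketch of Q-learning on a lattice with spatial reciprocity.
# Assumptions (not specified in the abstract): the state is indexed by the
# number of cooperating neighbours (0..4) and the action is one of 5
# discretized strategy levels, giving a 5 x 5 Q-table per agent.

N_STATES, N_ACTIONS = 5, 5              # 5 x 5 table per agent
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.05  # illustrative learning parameters
L = 20                                  # lattice side length (assumed)

rng = np.random.default_rng(0)
Q = np.zeros((L, L, N_STATES, N_ACTIONS))   # one 5 x 5 table per agent


def neighbours(i, j):
    """Von Neumann neighbourhood (four neighbours) with periodic boundaries."""
    return [((i - 1) % L, j), ((i + 1) % L, j),
            (i, (j - 1) % L), (i, (j + 1) % L)]


def choose_action(q_row):
    """Epsilon-greedy selection over the five discretized strategy levels."""
    if rng.random() < EPSILON:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(q_row))


def q_update(i, j, state, action, reward, next_state):
    """Standard Q-learning update restricted to agent (i, j)'s own 5 x 5 table.

    The reward is assumed to come from playing the game with the four
    neighbours returned by neighbours(i, j).
    """
    best_next = np.max(Q[i, j, next_state])
    Q[i, j, state, action] += ALPHA * (reward + GAMMA * best_next
                                       - Q[i, j, state, action])
```

The point of the sketch is the dimensionality argument: regardless of whether the underlying strategies are discrete, continuous, or mixed, each agent only maintains a fixed 5 × 5 table keyed by a local summary of its neighbourhood, so the learning space does not grow with the strategy resolution.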
