The recent paradigm shift from imitation learning to reinforcement learning (RL) has proved productive in understanding human behaviors. In the RL paradigm, individuals search for optimal strategies by interacting with their environment, which makes gathering, processing, and utilizing information from their surroundings crucial. However, existing studies of public goods games using the Q-learning algorithm typically adopt a self-regarding setup, where individuals adjust their strategies based solely on their own strategic information, neglecting environmental factors. In this work, we investigate the evolution of cooperation in a multiplayer game, the public goods game, using the Q-learning algorithm and leveraging environmental information. Specifically, players make decisions based on the cooperation information in their neighborhood. Our results show that cooperation is more likely to emerge than under imitation learning with the Fermi-function-based update rule. Of particular interest is an anomalous non-monotonic dependence that is revealed when voluntary participation is further introduced. Analysis of the Q-table explains the mechanisms behind the evolution of cooperation. Our findings indicate the fundamental role of environmental information in the RL paradigm for understanding the evolution of cooperation, and human behaviors more generally.
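To make the setup concrete, the sketch below shows one plausible way to encode such an environment-aware Q-learning agent, with the state given by the number of cooperating neighbors observed in the previous round. The group size, learning parameters, exploration scheme, and payoff form are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters (assumptions for illustration, not the paper's values)
GROUP_SIZE = 5            # focal player plus 4 lattice neighbors
ALPHA, GAMMA = 0.1, 0.9   # learning rate and discount factor
EPSILON = 0.02            # epsilon-greedy exploration rate
R_SYNERGY = 3.0           # synergy factor of the public goods game
ACTIONS = (0, 1)          # 0 = defect, 1 = cooperate


class QLearner:
    """Environment-aware agent: state = number of cooperating neighbors (0..4)."""

    def __init__(self):
        # One Q-table row per neighborhood cooperation level, one column per action
        self.q = np.zeros((GROUP_SIZE, len(ACTIONS)))

    def act(self, state):
        # Epsilon-greedy choice over the row of the current neighborhood state
        if rng.random() < EPSILON:
            return int(rng.choice(ACTIONS))
        return int(np.argmax(self.q[state]))

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update
        td_target = reward + GAMMA * self.q[next_state].max()
        self.q[state, action] += ALPHA * (td_target - self.q[state, action])


def pgg_payoff(my_action, n_coop_neighbors):
    """Public goods payoff: pooled contributions scaled by r, shared equally, minus own cost."""
    contributions = my_action + n_coop_neighbors
    return R_SYNERGY * contributions / GROUP_SIZE - my_action


# Toy episode: the agent repeatedly faces a neighborhood with 3 cooperators
agent = QLearner()
state = 3
for _ in range(1000):
    a = agent.act(state)
    reward = pgg_payoff(a, state)
    agent.update(state, a, reward, state)  # neighborhood held fixed for illustration
print(agent.q[state])
```

In a full simulation the neighborhood state would be recomputed each round from the neighbors' latest actions on the lattice, so the Q-table the agent learns reflects how profitable cooperation or defection is in differently cooperative environments.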