The emergence of cooperation is a central issue in understanding collective behavior and evolution. The eco-evolutionary game model introduces a human–environment coupling mechanism, revealing that feedback between strategies and the environment is a key element in sustaining long-term cooperation. Previous theoretical studies have observed periodic oscillations between cooperative and defective actions under certain conditions. However, these studies assume that cooperators hold a benefit advantage over defectors, and so they do not explain how cooperation emerges in the first place. We argue that addressing this question requires accounting for inherent human memory characteristics. We refine the eco-evolutionary game model using reinforcement learning, constructing a multi-agent system that couples the environment with memory-based decision-making, and analyze it from both collective and individual perspectives. Our findings show that, with the memory mechanism, oscillations between collective cooperation and defection can still occur even if defection remains a strict Nash equilibrium. Cooperation emerges from the group’s random exploratory actions in depleted environments, which alter the environment’s trend. A positive feedback loop then forms among the environment, individual rewards, and actions, stabilizing cooperation as a favorable individual strategy at that point. However, once group cooperation is established, individuals seeking optimal behavior transition from cooperators to defectors through exploration, and cooperation collapses. Subsequently, the memory mechanism reengages, diluting defectors’ expected payoffs and initiating a new round of exploratory behavior within the group. Our results reveal the micro-level mechanisms driving cyclic oscillations, enhancing our understanding of the environment-strategy interplay.