Abstract

Pavlov was proposed as a leading strategy for realizing cooperation because it dominates over long periods in evolutionary computer simulations of the Iterated Prisoner's Dilemma. However, our numerical calculations reveal that neither Pavlov nor any other cooperative strategy is evolutionarily stable among the stochastic strategies with memory of only one previous move. We propose a simple learning rule based on reinforcement. The learner changes its internal state depending on an evaluation of whether the score in the previous round exceeds a critical value (the aspiration level), which is genetically fixed. The current internal state determines the learner's move, but we found that the aspiration level determines its final behavior. The cooperative variant, which has an intermediate aspiration level, is not an evolutionarily stable strategy (ESS) when the evaluation is binary (good or bad). However, when the evaluation is quantified, some cooperative variants can invade not only All-C, Tit-For-Tat (TFT), and Pavlov but also noncooperative variants with different aspiration levels. Moreover, these variants establish robust cooperation that is evolutionarily stable against invasion by All-C, All-D, TFT, Pavlov, and noncooperative variants, and they receive a high score even when the error rate is high. Our results suggest that mutual cooperation can be maintained when players have a primitive learning ability.
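
For intuition, the following is a minimal Python sketch of the kind of aspiration-based learner the abstract describes, with quantified (rather than binary) evaluation. All names and numerical choices here (the class `AspirationLearner`, the payoff values, the learning rate) are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch of an aspiration-based reinforcement learner
# for the Iterated Prisoner's Dilemma; names and parameters are
# illustrative, not taken from the paper.
import random

# Standard Prisoner's Dilemma payoffs: (my move, opponent's move) -> my score
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

class AspirationLearner:
    def __init__(self, aspiration, learning_rate=0.2):
        self.aspiration = aspiration    # genetically fixed critical score
        self.p_cooperate = 0.5          # internal state: tendency to cooperate
        self.learning_rate = learning_rate
        self.last_move = None

    def move(self):
        self.last_move = "C" if random.random() < self.p_cooperate else "D"
        return self.last_move

    def learn(self, my_score):
        # Quantified evaluation: the stimulus scales with how far the
        # score exceeds (or falls short of) the aspiration level.
        stimulus = self.learning_rate * (my_score - self.aspiration)
        # A positive stimulus reinforces the last move; a negative one weakens it.
        if self.last_move == "C":
            self.p_cooperate += stimulus
        else:
            self.p_cooperate -= stimulus
        self.p_cooperate = min(1.0, max(0.0, self.p_cooperate))

# Two learners with an intermediate aspiration level (here 2.0, between
# the mutual-defection payoff 1 and the mutual-cooperation payoff 3)
# tend to settle into mutual cooperation under these dynamics.
a = AspirationLearner(aspiration=2.0)
b = AspirationLearner(aspiration=2.0)
for _ in range(1000):
    ma, mb = a.move(), b.move()
    a.learn(PAYOFF[(ma, mb)])
    b.learn(PAYOFF[(mb, ma)])
print(a.p_cooperate, b.p_cooperate)
```

Note how the intermediate aspiration level does the work: mutual cooperation (score 3) exceeds the aspiration and is reinforced, while mutual defection (score 1) falls below it and is weakened, pulling both learners back toward cooperation.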
