Abstract

This paper studies Erev and Roth's reinforcement learning model with foregone payoff information in normal-form games: players observe not only their realised payoffs but also the foregone payoffs they could have obtained by choosing other actions. We provide conditions under which the reinforcement learning process almost surely converges to a regular quantal response equilibrium (Goeree et al. 2005). We also show that, under the linear choice rule of the reinforcement learning model, this model shares the same asymptotic behaviour as an adaptive learning model that nests experience-weighted attraction learning, payoff assessment learning, and stochastic fictitious play. In addition, we provide conditions under which the reinforcement learning process under the logit choice rule almost surely converges to a Nash equilibrium.
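The model described above can be illustrated with a minimal simulation sketch. The game, payoffs, initial propensities, and horizon below are all illustrative assumptions, not taken from the paper: each player keeps a propensity for each action, chooses with the linear rule (probabilities proportional to propensities), and, because foregone payoffs are observed, reinforces *every* action by the payoff it would have earned against the opponent's realised action.

```python
import numpy as np

# Hypothetical 2x2 common-interest game (both players receive the same payoff);
# the specific payoff matrix is an illustrative assumption.
PAYOFFS_ROW = np.array([[2.0, 0.0],
                        [0.0, 1.0]])
PAYOFFS_COL = PAYOFFS_ROW.T  # column player's payoffs, indexed (own action, opponent action)

def simulate(T=5000, seed=0):
    """Erev-Roth-style reinforcement with foregone payoffs and the linear choice rule."""
    rng = np.random.default_rng(seed)
    q_row = np.ones(2)  # initial propensities (assumed positive)
    q_col = np.ones(2)
    for _ in range(T):
        # Linear choice rule: choice probabilities proportional to propensities.
        a_row = rng.choice(2, p=q_row / q_row.sum())
        a_col = rng.choice(2, p=q_col / q_col.sum())
        # Foregone-payoff reinforcement: every action is reinforced by the payoff
        # it would have earned against the opponent's realised action, not just
        # the action actually played.
        q_row += PAYOFFS_ROW[:, a_col]
        q_col += PAYOFFS_COL[:, a_row]
    # Return the empirical (linear-rule) mixed strategies after T rounds.
    return q_row / q_row.sum(), q_col / q_col.sum()

p_row, p_col = simulate()
print(p_row, p_col)
```

A logit choice rule would replace the linear normalisation with probabilities proportional to `exp(lambda * q)`; the paper's convergence results distinguish these two cases, and this sketch only implements the linear one.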
