Abstract

We consider reinforcement learning in games with both positive and negative payoffs. The Cross rule is the prototypical reinforcement learning rule in games that have only positive payoffs. We extend this rule to incorporate negative payoffs, obtaining the generalized reinforcement learning rule. Applying this rule to a population game, we obtain the generalized reinforcement dynamic, which describes the evolution of mixed strategies in the population. We apply the dynamic to the class of Rock–Scissor–Paper (RSP) games and establish local convergence to the interior rest point in all such games, including the bad RSP game.
