Abstract
Melioration learning is an empirically well-grounded model of reinforcement learning. Through computer simulations, this paper derives the model's predictions for several repeatedly played two-person games. The results indicate a likely convergence to a pure Nash equilibrium of the game. If no pure equilibrium exists, the relative frequencies of choice may approach the predictions of the mixed Nash equilibrium. Yet in some games, no stable state is reached.
Highlights
Various learning models have been analysed in the game-theoretic literature.
In contrast to previous specifications [32,33,34], this paper presents a formal representation of melioration learning that is fully consistent with Eq (1) and builds on a well-established reinforcement-learning algorithm.
A simple process of completely uncoupled learning was investigated. It differs from previous models such as regret-testing or trial-and-error learning in two respects: it is derived from empirical research, and its convergence to equilibrium states in social interactions is not guaranteed.
Summary
The best-known learning models, such as fictitious play or Bayesian learning, describe normative processes that enable the players to find an equilibrium during the repeated play of a game [1]. Those models presume that information about the preferences and past actions of all players is available. This paper instead employs a simple psychological model of completely uncoupled learning, called melioration learning, which may not converge towards equilibrium states. Established by Herrnstein and Vaughan [11], melioration learning is a theory of individual decision-making from behavioural psychology. The paper's learning algorithm starts by initialising Q1(j) ← 0 and K1(j) ← 0 for all j ∈ E before entering its main repeat loop.
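To make this initialisation concrete, the following Python sketch pairs two melioration learners in a repeated two-person game. It is a minimal illustration under stated assumptions, not the paper's exact specification: the ε-greedy exploration rule, the incremental-mean update of Q, and the example pay-off matrix are choices made here for demonstration; from the paper's listing we only know that Q1(j) and K1(j) are initialised to zero for every action j ∈ E.

```python
import random

class MeliorationAgent:
    """Hypothetical sketch of a melioration learner.

    Q[j] holds an incremental average of the pay-offs earned with action j,
    and K[j] counts how often j was chosen, mirroring the initialisation
    Q1(j) <- 0, K1(j) <- 0 for all j in E. The agent mostly picks the action
    with the highest average rate of reinforcement; the epsilon-greedy
    exploration rule is an assumption made for this sketch.
    """

    def __init__(self, actions, epsilon=0.1):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.Q = {j: 0.0 for j in actions}  # average pay-off per action
        self.K = {j: 0 for j in actions}    # choice count per action

    def choose(self):
        if random.random() < self.epsilon:                 # explore occasionally
            return random.choice(self.actions)
        return max(self.actions, key=lambda j: self.Q[j])  # meliorate

    def update(self, action, payoff):
        self.K[action] += 1
        # incremental mean: Q <- Q + (r - Q) / K
        self.Q[action] += (payoff - self.Q[action]) / self.K[action]


def play(payoffs, rounds=10_000):
    """Repeated two-person game between two melioration learners.

    payoffs[(a, b)] gives the pay-off pair for row action a and column
    action b. Learning is completely uncoupled: each agent observes only
    its own pay-off, never the opponent's action.
    """
    actions = sorted({a for a, _ in payoffs})
    row, col = MeliorationAgent(actions), MeliorationAgent(actions)
    for _ in range(rounds):
        a, b = row.choose(), col.choose()
        ra, rb = payoffs[(a, b)]
        row.update(a, ra)
        col.update(b, rb)
    return row.Q, col.Q


if __name__ == "__main__":
    # Prisoner's dilemma: play should settle near the pure equilibrium (D, D).
    pd = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
    print(play(pd))
```

In the prisoner's dilemma example, both learners' average pay-off estimates should come to favour defection, consistent with the abstract's observation that play tends to converge to a pure Nash equilibrium when one exists.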