The actions of others act as a pseudo-reward to drive imitation in the context of social reinforcement learning

Emmanuelle Bonnet, Stefano Palminteri, Bahador Bahrami, Anis Najar, Gabriel Gasque

doi:10.1371/journal.pbio.3001028.r007

Open Access

https://doi.org/10.1371/journal.pbio.3001028.r007

Abstract

While there is no doubt that social signals affect human reinforcement learning, there is still no consensus about how this process is computationally implemented. To address this issue, we compared three psychologically plausible hypotheses about the algorithmic implementation of imitation in reinforcement learning. The first hypothesis, decision biasing (DB), postulates that imitation consists in transiently biasing the learner’s action selection without affecting their value function. According to the second hypothesis, model-based imitation (MB), the learner infers the demonstrator’s value function through inverse reinforcement learning and uses it to bias action selection. Finally, according to the third hypothesis, value shaping (VS), the demonstrator’s actions directly affect the learner’s value function. We tested these three hypotheses in 2 experiments (N = 24 and N = 44) featuring a new variant of a social reinforcement learning task. We show through model comparison and model simulation that VS provides the best explanation of learners’ behavior. The results were replicated in a third independent experiment featuring a larger cohort and a different design (N = 302). In our experiments, we also manipulated the quality of the demonstrators’ choices and found that learners were able to adapt their imitation rate, so that only skilled demonstrators were imitated. We proposed and tested an efficient meta-learning process to account for this effect, in which imitation is regulated by the agreement between the learner and the demonstrator. In sum, our findings provide new insights and perspectives on the computational mechanisms underlying adaptive imitation in human reinforcement learning.
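The contrast between decision biasing and value shaping described in the abstract can be illustrated with a minimal sketch. It is not the authors' model code: the delta-rule form, the pseudo-reward of 1, the agreement rule, and all names (`value_shaping`, `decision_biasing`, `update_imitation_rate`, `alpha_i`, `eta`, `beta`) are assumptions chosen for illustration. Model-based imitation is omitted because it would require an inverse reinforcement learning step that the abstract does not specify.

```python
import math

def softmax(q, beta):
    """Softmax policy over action values q with inverse temperature beta."""
    m = max(q)
    exps = [math.exp(beta * (v - m)) for v in q]
    total = sum(exps)
    return [e / total for e in exps]

def value_shaping(q, demo_action, alpha_i, pseudo_reward=1.0):
    """VS (assumed form): the observed action is treated as a pseudo-reward
    and changes the learner's value function, so the effect persists."""
    q = list(q)
    q[demo_action] += alpha_i * (pseudo_reward - q[demo_action])
    return q

def decision_biasing(policy, demo_action, alpha_i):
    """DB (assumed form): the current choice probabilities are shifted toward
    the observed action; the value function is left untouched."""
    return [p + alpha_i * ((1.0 if a == demo_action else 0.0) - p)
            for a, p in enumerate(policy)]

def update_imitation_rate(alpha_i, q, demo_action, eta):
    """Hypothetical meta-learning step: move the imitation rate toward 1 when
    the demonstrator picks the learner's currently preferred action, and
    toward 0 otherwise."""
    agree = 1.0 if q[demo_action] == max(q) else 0.0
    return alpha_i + eta * (agree - alpha_i)

# Two-armed bandit, learner starts with no preference.
q = [0.0, 0.0]
q_vs = value_shaping(q, demo_action=1, alpha_i=0.3)
p_db = decision_biasing(softmax(q, beta=5.0), demo_action=1, alpha_i=0.3)
rate_up = update_imitation_rate(0.3, [0.2, 0.8], demo_action=1, eta=0.5)
rate_down = update_imitation_rate(0.3, [0.2, 0.8], demo_action=0, eta=0.5)
```

In this sketch, VS leaves a lasting trace in `q_vs`, whereas DB only alters the choice probabilities in `p_db` for the current decision and `q` stays as it was. The agreement-based update raises the imitation rate after a matching demonstration and lowers it after a mismatching one.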

Publication Date: Dec 8, 2020
Citations: 19
License type: CC BY 4.0

Similar Papers

The actions of others act as a pseudo-reward to drive imitation in the context of social reinforcement learning.
Anis Najar ... Matthew F S Rushworth
PLOS Biology | VOL. 18
08 Dec 2020

Reinforcement Learning for Clinical Applications.
Kia Khezeli ... Benjamin Shickel
Clinical journal of the American Society of Nephrology : CJASN | VOL. 18
08 Feb 2023

Using Inverse Reinforcement Learning with Real Trajectories to Get More Trustworthy Pedestrian Simulations
Francisco Martinez-Gil ... Dolors Serra
Mathematics | VOL. 8
02 Sep 2020

Inverse reinforcement learning using Dynamic Policy Programming
Eiji Uchibe ... Kenji Doya
01 Oct 2014
