Which Temporal Difference Learning Algorithm Best Reproduces Dopamine Activity in a Multi-choice Task?

Jean Bellot, Mehdi Khamassi, Olivier Sigaud

doi:10.1007/978-3-642-33093-3_29

Abstract

The activity of dopaminergic (DA) neurons has been hypothesized to encode a reward prediction error (RPE) which corresponds to the error signal in Temporal Difference (TD) learning algorithms. This hypothesis has been reinforced by numerous studies showing the relevance of TD learning algorithms to describe the role of basal ganglia in classical conditioning. However, recent recordings of DA neurons during multi-choice tasks raised contradictory interpretations on whether DA's RPE signal is action dependent or not. Thus, the precise TD algorithm (i.e., Actor-Critic, Q-learning or SARSA) that best describes DA signals remains unknown. Here we simulate and precisely analyze these TD algorithms on a multi-choice task performed by rats. We find that DA activity previously reported in this task is best fitted by a TD error which has not fully converged, and which converged faster than observed behavioral adaptation.

Keywords: dopamine, reinforcement learning, reward prediction error, behavioral adaptation, instrumental conditioning
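The question of whether the RPE is action dependent comes down to which value function the TD error is computed from. As a minimal sketch, assuming the standard textbook forms of the three algorithms named in the abstract (the function name `td_errors`, the tabular arrays `V` and `Q`, and all numeric values are hypothetical illustrations, not taken from the paper), the three candidate error signals for a single transition can be written side by side:

```python
import numpy as np

def td_errors(reward, gamma, V, Q, s, a, s_next, a_next):
    """Return the three candidate reward prediction errors for one transition.

    V is a state-value table, Q a state-action value table (both hypothetical).
    """
    # Actor-Critic: error on the state value, independent of the chosen action.
    delta_ac = reward + gamma * V[s_next] - V[s]
    # Q-learning: bootstraps on the best action available in the next state.
    delta_q = reward + gamma * np.max(Q[s_next]) - Q[s, a]
    # SARSA: bootstraps on the action actually taken in the next state.
    delta_sarsa = reward + gamma * Q[s_next, a_next] - Q[s, a]
    return delta_ac, delta_q, delta_sarsa

# Illustrative values only: 2 states, 2 actions.
V = np.array([0.5, 1.0])
Q = np.array([[0.2, 0.8],
              [0.4, 1.0]])
deltas = td_errors(reward=0.0, gamma=0.9, V=V, Q=Q,
                   s=0, a=0, s_next=1, a_next=0)
print(deltas)
```

With identical experience the three errors differ: the Actor-Critic error ignores which action was chosen, while the Q-learning and SARSA errors depend on it and differ from each other whenever the next action taken is not the greedy one.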

Publication Date: Jan 1, 2012
Citations: 4
License type: cc-by

Similar Papers

Which Temporal Difference learning algorithm best reproduces dopamine activity in a multi-choice task?
Jean Bellot ... Mehdi Khamassi
BMC Neuroscience | VOL. 14
01 Jul 2013

Temporal difference learning applied to sequential detection
Chengan Guo ... A Kuh
IEEE Transactions on Neural Networks | VOL. 8
01 Mar 1997

Ventral midbrain dopaminergic neurons: From neurogenesis to neurodegeneration
Eric J Huang
FEBS Letters | VOL. 589
12 Nov 2015

A gradual temporal shift of dopamine responses mirrors the progression of temporal difference error in machine learning.
Ryunosuke Amo ... Kenji F Tanaka
Nature Neuroscience | VOL. 25
07 Jul 2022
