Abstract

Functional magnetic resonance imaging (fMRI), cyclic voltammetry, and single-unit electrophysiology studies suggest that signals measured in the nucleus accumbens (Nacc) during value-based decision making represent reward prediction errors (RPEs), the difference between actual and predicted rewards. Here, we studied the precise temporal and spectral pattern of reward-related signals in the human Nacc. We recorded local field potentials (LFPs) from the Nacc of six epilepsy patients during an economic decision-making task. On each trial, patients decided whether to accept or reject a gamble with equal probabilities of a monetary gain or loss. The behavior of four patients was consistent with choices being guided by value expectations. Expected value signals before outcome onset were observed in three of those patients, at varying latencies and with nonoverlapping spectral patterns. Signals after outcome onset were correlated with RPE regressors in all subjects. However, further analysis revealed that these signals were better explained as outcome valence rather than RPE signals, with gamble gains and losses differing in the power of beta oscillations and in evoked response amplitudes. Taken together, our results do not support the idea that postsynaptic potentials in the Nacc represent an RPE that unifies outcome magnitude and prior value expectation. We discuss the generalizability of our findings to healthy individuals and the relation of our results to measurements of RPE signals obtained from the Nacc with other methods.
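
The quantities named in the abstract can be illustrated with a minimal sketch. It assumes the textbook definitions only: the expected value of a 50/50 gamble is the probability-weighted mean of gain and loss, the RPE is the outcome minus that expectation, and valence is the sign of the outcome. The function names and the monetary amounts are hypothetical and are not taken from the study.

```python
def expected_value(gain, loss, p_gain=0.5):
    """Expected value of a gamble; `loss` is given as a positive magnitude."""
    return p_gain * gain - (1.0 - p_gain) * loss


def reward_prediction_error(outcome, expected):
    """RPE: actual outcome minus predicted (expected) outcome."""
    return outcome - expected


def outcome_valence(outcome):
    """Valence: +1 for a gain, -1 for a loss, 0 for a null outcome."""
    return (outcome > 0) - (outcome < 0)


# Two hypothetical gambles (amounts are illustrative, not from the paper).
ev_a = expected_value(gain=10, loss=2)   # 0.5*10 - 0.5*2 = 4.0
ev_b = expected_value(gain=4, loss=2)    # 0.5*4  - 0.5*2 = 1.0

# Winning either gamble has the same valence but a different RPE.
rpe_a_win = reward_prediction_error(10, ev_a)   # 6.0
rpe_b_win = reward_prediction_error(4, ev_b)    # 3.0
rpe_a_loss = reward_prediction_error(-2, ev_a)  # -6.0
```

With equal gain and loss probabilities, the sign of the RPE always matches the outcome valence, so a pure valence signal will correlate with an RPE regressor. Under these definitions, only the scaling of the signal with outcome magnitude and prior expectation separates the two accounts.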

Highlights

  • Reinforcement learning is thought to rely on reward prediction errors (RPEs), the difference between experienced and expected rewards

  • Our primary goal was to test whether local field potentials (LFPs) from the human nucleus accumbens (Nacc) contain signals that are compatible with representing expected value and RPE signals

  • We tested whether LFPs in the human Nacc contain signals that are compatible with an RPE

Summary

Introduction

Reinforcement learning is thought to rely on reward prediction errors (RPEs), the difference between experienced and expected rewards. Latencies and durations of reward-related signals measured by human functional magnetic resonance imaging (fMRI) or by voltammetry are on the order of seconds (e.g., Hart et al 2014; Pessiglione et al 2006). This contrasts with the rapid and transient modulation of neuronal firing rate in dopaminergic midbrain neurons, which occurs within a few hundred milliseconds (Schultz et al 1997). To confirm that patients’ behavior was consistent with forming a reward expectation that influenced their choices, behavior in the task was fitted by standard parametric decision models.
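
The excerpt does not say which parametric decision models were fitted. As one possible illustration, the sketch below assumes a logistic choice rule in which the probability of accepting a gamble grows with its expected value, scaled by a single sensitivity parameter `beta`. The model form, the grid search, and the trial data are all assumptions made for illustration, not the authors' procedure.

```python
import math


def p_accept(ev, beta):
    """Logistic choice rule (assumed): P(accept) given expected value `ev`."""
    return 1.0 / (1.0 + math.exp(-beta * ev))


def neg_log_likelihood(beta, evs, choices):
    """Negative log-likelihood of accept (1) / reject (0) choices under the rule."""
    nll = 0.0
    for ev, accepted in zip(evs, choices):
        p = p_accept(ev, beta)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        nll -= math.log(p if accepted else 1.0 - p)
    return nll


def fit_beta(evs, choices, grid=None):
    """Maximum-likelihood estimate of beta by a coarse grid search."""
    if grid is None:
        grid = [i * 0.05 for i in range(101)]  # 0.0, 0.05, ..., 5.0
    return min(grid, key=lambda b: neg_log_likelihood(b, evs, choices))


# Hypothetical trials: expected value of each gamble and the choice made.
evs = [-4.0, -2.0, -1.0, 1.0, 2.0, 4.0]
choices = [0, 0, 1, 0, 1, 1]

beta_hat = fit_beta(evs, choices)
```

In this sketch, a fitted `beta` clearly above zero would indicate that choices track expected value, whereas a value near zero would indicate choices that are insensitive to it.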

