Abstract

In reinforcement learning, adaptive behavior depends on the ability to predict future outcomes based on previous decisions. The Reward Positivity (RewP) is thought to encode reward prediction errors in the anterior midcingulate cortex (aMCC) whenever these predictions are violated. Although the RewP has been extensively studied in the context of simple binary (win vs. loss) reward processing, recent studies suggest that the RewP scales with complex feedback in a fine-graded fashion. The aim of this study was to replicate and extend previous findings that the RewP reflects the integrated sum of the instantaneous and delayed consequences of a single outcome, by adding a third temporal dimension to the feedback information content. We used a complex reinforcement-learning task in which each option was associated with an immediate, an intermediate, and a delayed monetary outcome, and analyzed the RewP in the time domain as well as fronto-medial theta power in the time-frequency domain. To test whether the RewP's sensitivity to the three outcome dimensions reflects stable, trait-like individual differences in reward processing, a retest session took place 3 months later. The results confirm that the RewP reflects the integrated value of complex, temporally extended consequences in a stable manner, although there was no relation to behavioral choice. Our findings indicate that the medial frontal cortex receives fine-graded information about complex action outcomes that, however, may not necessarily translate into cognitive or behavioral control processes.
