Unconscious reinforcement learning of hidden brain states supported by confidence

Aurelio Cortese, Mitsuo Kawato, Hakwan Lau

doi:10.1038/s41467-020-17828-8

Abstract

Can humans be trained to make strategic use of latent representations in their own brains? We investigate how human subjects can derive reward-maximizing choices from intrinsic high-dimensional information represented stochastically in neural activity. Reward contingencies are defined in real-time by fMRI multivoxel patterns; optimal action policies thereby depend on multidimensional brain activity taking place below the threshold of consciousness, by design. We find that subjects can solve the task within two hundred trials and errors, as their reinforcement learning processes interact with metacognitive functions (quantified as the meaningfulness of their decision confidence). Computational modelling and multivariate analyses identify a frontostriatal neural mechanism by which the brain may untangle the ‘curse of dimensionality’: synchronization of confidence representations in prefrontal cortex with reward prediction errors in basal ganglia supports exploration of latent task representations. These results may provide an alternative starting point for future investigations into unconscious learning and functions of metacognition.

Highlights

Empirical optimal action-selection rates were compared to a chance level of 0.5, the rate attained if actions were randomly selected at every trial.
Two main questions were addressed in this study: Can human subjects learn to make use of latent, high-dimensional brain activity? What is the putative vehicle and neural substrate of this ability? The closed-loop design adopted here granted a unique opportunity to investigate the ability of the human brain to learn to use unconscious, high-dimensional internal representations.
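
As an illustration of comparing an optimal action-selection rate against the 0.5 chance level, here is a minimal Python sketch. The page does not say which statistical test the authors used, so the one-sided exact binomial test below is an assumption, and the function name `binomial_p_above_chance` and the trial counts are hypothetical values chosen only for the example.

```python
from math import comb

def binomial_p_above_chance(n_optimal: int, n_trials: int, chance: float = 0.5) -> float:
    """One-sided exact binomial p-value: P(X >= n_optimal) if every trial were a chance pick."""
    return sum(
        comb(n_trials, k) * chance**k * (1 - chance) ** (n_trials - k)
        for k in range(n_optimal, n_trials + 1)
    )

# Hypothetical numbers for illustration only (not taken from the paper).
n_trials = 200
n_optimal = 120

rate = n_optimal / n_trials
p_value = binomial_p_above_chance(n_optimal, n_trials)
print(f"optimal-action rate = {rate:.2f}, one-sided p = {p_value:.4f}")
```

A small p-value would indicate that the observed rate is unlikely under purely random action selection. A group-level analysis across subjects would need a different approach than this single-subject sketch.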

Summary

Introduction

Can humans be trained to make strategic use of latent representations in their own brains? We investigate how human subjects can derive reward-maximizing choices from intrinsic high-dimensional information represented stochastically in neural activity. Computational modelling and multivariate analyses identify a frontostriatal neural mechanism by which the brain may untangle the ‘curse of dimensionality’: synchronization of confidence representations in prefrontal cortex with reward prediction errors in basal ganglia supports exploration of latent task representations. These results may provide an alternative starting point for future investigations into unconscious learning and functions of metacognition. Because the time of decoding is pre-stimulus and the ensuing stimulus itself carries no direction information, the decoder alone defines the latent state from stochastic brain activity, along a predetermined classification boundary. Such multidimensional patterns are known to represent information that is generally below consciousness [1,16,17,18,19].
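
For readers unfamiliar with the term, a reward prediction error is the difference between the reward received and the reward expected. The sketch below shows a generic textbook value update with a softmax choice rule. It is not the authors' computational model, it omits confidence entirely, and the names `rpe_update` and `softmax_choice_prob`, the parameters `ALPHA` and `BETA`, and the trial sequence are all hypothetical.

```python
import math
from typing import List, Tuple

def rpe_update(q: float, reward: float, alpha: float) -> Tuple[float, float]:
    """Return (updated value, reward prediction error) for the chosen action."""
    delta = reward - q  # reward prediction error: outcome minus expectation
    return q + alpha * delta, delta

def softmax_choice_prob(q_a: float, q_b: float, beta: float) -> float:
    """Probability of picking action A over B under a two-option softmax rule."""
    return 1.0 / (1.0 + math.exp(-beta * (q_a - q_b)))

# Hypothetical setup: one latent state, two actions, made-up parameters.
ALPHA, BETA = 0.1, 3.0
q_values: List[float] = [0.0, 0.0]  # value estimates for actions A (0) and B (1)
trials = [(0, 1.0), (0, 1.0), (1, 0.0), (0, 1.0)]  # (chosen action, reward), invented

for action, reward in trials:
    q_values[action], delta = rpe_update(q_values[action], reward, ALPHA)
    p_a = softmax_choice_prob(q_values[0], q_values[1], BETA)
    print(f"action={action} reward={reward} RPE={delta:+.3f} P(choose A)={p_a:.3f}")
```

In this toy run only the chosen action's value changes on each trial, and the prediction error shrinks as the value estimate approaches the received reward.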

Journal: Nature Communications
Publication Date: Aug 31, 2020
Citations: 32
License type: open-access

Similar Papers

How thoughts arise from sights: inferotemporal and prefrontal contributions to vision
Simon Kornblith ... Doris Y Tsao
Current Opinion in Neurobiology | VOL. 46
22 Sep 2017

Choice-selective sequences dominate in cortical relative to thalamic inputs to NAc to support reinforcement learning.
Nathan F Parker ... Laura M Haetzel
Cell Reports | VOL. 39
01 May 2022

Beta oscillations in monkey striatum encode reward prediction error signals.
Ruggero Basanisi ... Paul Apicella
The Journal of Neuroscience | VOL. 43
04 Apr 2023

The hypothetical cost-conflict monitor: is it a possible trigger for conflict-driven control mechanisms in the human brain?
Sareh Zendehrouh ... Shahriar Gharibzadeh
Frontiers in Computational Neuroscience | VOL. 8
21 Jul 2014
