Dynamic reinforcement learning reveals time-dependent shifts in strategy during reward learning.

Sarah Jo C Venditto, Nathaniel D Daw, Carlos D Brody, Kevin J Miller

doi:10.1101/2024.02.28.582617

Abstract

Different brain systems have been hypothesized to subserve multiple "experts" that compete to generate behavior. In reinforcement learning, two general processes, one model-free (MF) and one model-based (MB), are often modeled as a mixture of agents (MoA) and hypothesized to capture the difference between automaticity and deliberation. However, a static MoA cannot capture shifts in strategy. To investigate such dynamics, we present the mixture-of-agents hidden Markov model (MoA-HMM), which simultaneously learns inferred action values from a set of agents and the temporal dynamics of underlying "hidden" states that capture shifts in agent contributions over time. Applying this model to a multi-step, reward-guided task in rats reveals a progression of within-session strategies: a shift from initial MB exploration to MB exploitation, and finally to reduced engagement. The inferred states predict changes in both response time and OFC neural encoding during the task, suggesting that these states capture real shifts in dynamics.
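
To make the idea of state-dependent agent weights concrete, the following is a minimal sketch of a forward (filtering) pass for a toy mixture-of-agents HMM. It is not the authors' implementation. It assumes that each agent's action values are already computed for every trial (in the paper they are learned jointly with the model), that choices follow a softmax over the weighted sum of agent values, and that each hidden state has its own weight vector. The function name `moa_hmm_forward` and all parameter names are hypothetical.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def moa_hmm_forward(agent_values, choices, weights, trans, init):
    """Forward pass of a toy mixture-of-agents HMM (illustrative assumption).

    agent_values: (T, A, K) value each of A agents assigns to K actions per trial
    choices:      (T,) observed action index on each trial
    weights:      (S, A) agent weights for each of S hidden states
    trans:        (S, S) row-stochastic state transition matrix
    init:         (S,) initial state distribution
    Returns the log-likelihood of the choices and the filtered
    state probabilities, shape (T, S).
    """
    T = len(choices)
    S = weights.shape[0]
    filtered = np.zeros((T, S))
    log_lik = 0.0
    prior = np.asarray(init, dtype=float)
    for t in range(T):
        # Net action values under each state: (S, A) @ (A, K) -> (S, K)
        net = weights @ agent_values[t]
        # Probability of the observed choice under each state's weighting
        lik = np.array([softmax(net[s])[choices[t]] for s in range(S)])
        joint = prior * lik
        norm = joint.sum()
        log_lik += np.log(norm)
        filtered[t] = joint / norm
        # Propagate the state belief to the next trial
        prior = filtered[t] @ trans
    return log_lik, filtered


# Toy usage: two agents, two actions, two hidden states (all values made up).
rng = np.random.default_rng(0)
toy_values = rng.normal(size=(5, 2, 2))
toy_choices = np.array([0, 1, 1, 0, 1])
toy_weights = np.array([[2.0, 0.0], [0.0, 2.0]])  # state 0 favors agent 0
toy_trans = np.array([[0.9, 0.1], [0.2, 0.8]])
toy_init = np.array([0.5, 0.5])
toy_ll, toy_filtered = moa_hmm_forward(
    toy_values, toy_choices, toy_weights, toy_trans, toy_init
)
print(toy_ll)
print(toy_filtered)
```

With a single hidden state, this sketch reduces to a static mixture of agents, which is the case the abstract describes as unable to capture shifts in strategy.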

More From: bioRxiv : the preprint server for biology

Similar Papers

P242. Anxiety Associated with Perceived Lack of Control Over Stress During the COVID-19 Pandemic Impairs Reward Learning
Marc Guitart-Masip ... Andreas Olsson
Biological Psychiatry | VOL. 91
28 Apr 2022

Flexibility to contingency changes distinguishes habitual and goal-directed strategies in humans.
Julie J Lee ... Mehdi Keramati
PLOS Computational Biology | VOL. 13
28 Sep 2017

EEG-based classification of learning strategies: Model-based and model-free reinforcement learning
Dongjae Kim ... Sang Wan Lee
01 Jan 2018

Human subjects exploit a cognitive map for credit assignment
Rani Moran ... Raymond J Dolan
Proceedings of the National Academy of Sciences | VOL. 118
21 Jan 2021
