Abstract

Learning to form appropriate, task-relevant working memory representations is a complex process central to cognition. Gating models frame working memory as a collection of past observations and use reinforcement learning (RL) to solve the problem of when to update these observations. Investigation of how gating models relate to brain and behavior, however, remains at an early stage. The current study explored the ability of simple RL gating models to replicate rule-learning behavior in rats. Rats were trained on a maze-based spatial learning task that required them to make trial-by-trial choices contingent upon their previous experience. Using an abstract version of this task, we tested the ability of two gating algorithms, one based on the Actor-Critic and the other on the State-Action-Reward-State-Action (SARSA) algorithm, to generate behavior consistent with the rats'. Both models produced rule-acquisition behavior consistent with the experimental data, though only the SARSA gating model mirrored the faster learning observed following rule reversal. We also found that both gating models learned multiple strategies for solving the initial task, a property that highlights the multi-agent nature of such models and that is important for understanding the neural basis of individual differences in behavior.
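Although this page does not give the models' exact parameterization, the SARSA algorithm named above rests on the standard on-policy temporal-difference update; on this reading, a gating model applies it over states augmented with working-memory contents and actions augmented with gate operations:

Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_{t+1} + \gamma \, Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t) \right]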

Highlights

  • Working memory involves the short-term maintenance of task-relevant information and is essential in the successful guidance of many behaviors

  • We show that both gating models produce behavior consistent with initial rule-acquisition by the animals but differ in their abilities to replicate faster learning following rule reversal

  • Rule acquisition is reproduced in gating models: two reinforcement learning (RL) gating algorithms, one based on Actor-Critic methods (Barto et al., 1983) and the other on SARSA (Rummery & Niranjan, 1994), were given an abstract version of the rule-learning maze task (Figure 1B), and parameters were fit to a subset of the rat behavioral data


Introduction

Working memory involves the short-term maintenance of task-relevant information and is essential for the successful guidance of many behaviors (for review see Baddeley, 2012). The aim of the current work was to investigate in detail the ability of gating models to match behavioral data by comparing the behavior of two RL gating models with the learning pattern of rats in a rule-learning task. We show that both gating models produce behavior consistent with initial rule acquisition by the animals but differ in their ability to replicate faster learning following rule reversal. We highlight the ability of both gating models to converge on multiple strategies and relate this property to multi-agent RL (MARL) systems in general.
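To make the gating idea concrete, the sketch below implements a SARSA agent whose state pairs the current observation with a single working-memory slot and whose composite actions pair a motor choice with a gate decision. This is an illustrative reconstruction under stated assumptions, not the study's implementation: the AlternationTask environment, the single memory slot, the epsilon-greedy policy, and all parameter values are hypothetical stand-ins for the abstract maze task.

# Minimal sketch of a SARSA-based working-memory gating agent on a toy
# alternation task. Illustrative assumptions throughout: the environment,
# the single memory slot, epsilon-greedy exploration, and all parameter
# values are stand-ins, not the study's implementation.
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

# Composite actions: a motor choice paired with a gate decision. An
# "open" gate overwrites the working-memory slot with the current
# observation; a "closed" gate maintains the stored item.
ACTIONS = [(motor, gate) for motor in ("left", "right")
           for gate in ("open", "closed")]

Q = defaultdict(float)  # Q-values indexed by ((observation, memory), action)

class AlternationTask:
    """Hypothetical stand-in for the abstract maze task: reward follows
    the arm opposite the previous choice, and the arm identity is seen
    only transiently before a blank delay, so the agent must gate it
    into memory to alternate successfully."""
    def reset(self):
        self.t, self.prev, self.at_choice = 0, None, True
        return "delay"

    def step(self, motor):
        if self.at_choice:
            # Choice point: motor selects an arm; reward for alternating.
            reward = 1.0 if motor != self.prev else 0.0
            self.prev = motor
            self.at_choice = False
            return motor, reward, False   # next observation = arm entered
        # Return run: motor is ignored; observation is an uninformative delay.
        self.t += 1
        self.at_choice = True
        return "delay", 0.0, self.t >= 20

def choose(state):
    """Epsilon-greedy selection over the composite action set."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def sarsa_episode(env):
    """One episode of on-policy SARSA over (observation, memory) states."""
    state = (env.reset(), None)           # (observation, memory slot)
    action = choose(state)
    done = False
    while not done:
        motor, gate = action
        next_obs, reward, done = env.step(motor)
        # The gate decision determines whether memory is updated or held.
        memory = state[0] if gate == "open" else state[1]
        next_state = (next_obs, memory)
        next_action = choose(next_state)
        # Standard SARSA temporal-difference update.
        target = reward + (0.0 if done else GAMMA * Q[(next_state, next_action)])
        Q[(state, action)] += ALPHA * (target - Q[(state, action)])
        state, action = next_state, next_action

env = AlternationTask()
for _ in range(500):
    sarsa_episode(env)

Under these assumptions the gate is genuinely necessary: the arm identity is visible only before the blank delay, so the agent can alternate only by learning to open the gate at the right moment and hold the item closed through the choice point. Reversing the reward contingency partway through training would give a crude analogue of the rule-reversal manipulation described above.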
