Exploring the limits of learning: Segregation of information integration and response selection is required for learning a serial reversal task.

Camilo Juan Mininni, B. Silvano Zanutto

doi:10.1371/journal.pone.0186959

Abstract

Animals are proposed to learn the latent rules governing their environment in order to maximize their chances of survival. However, rules may change without notice, forcing animals to keep a memory of which one is currently at work. Rule switching can lead to situations in which the same stimulus/response pairing is positively and negatively rewarded in the long run, depending on variables that are not accessible to the animal. This fact raises questions on how neural systems are capable of reinforcement learning in environments where the reinforcement is inconsistent. Here we address this issue by asking which aspects of connectivity, neural excitability and synaptic plasticity are key for a very general, stochastic spiking neural network model to solve a task in which rules change without being cued, taking the serial reversal task (SRT) as a paradigm. Contrary to what could be expected, we found strong limitations for biologically plausible networks to solve the SRT. In particular, we proved that no network of neurons can learn an SRT if a single neural population both integrates stimulus information and is responsible for choosing the behavioural response. This limitation is independent of the number of neurons, neuronal dynamics or plasticity rules, and arises from the fact that plasticity is locally computed at each synapse, and that synaptic changes and neuronal activity are mutually dependent processes. We propose and characterize a spiking neural network model that solves the SRT, which relies on separating the functions of stimulus integration and response selection. The model suggests that experimental efforts to understand neural function should focus on the characterization of neural circuits according to their connectivity, neural dynamics, and the degree of modulation of synaptic plasticity with reward.

Highlights

Natural environments are complex places in which animals strive to survive, with hidden variables and stochastic factors such that the information available at any moment is partial, and it must be sampled at several time points and integrated.
We will study the characteristics of an agent controlled by a biologically plausible neural network that learns to solve a Serial Reversal Task (SRT), conforming to what we will define as the hypothesis of functionality by learning, which states that the set of configurations that gives functionality is a small subset of the set of initial configurations.
Global performance starts around 50% at t = 0 ms, which means that the s(T − 1), R(T − 1) and r(T − 1) components were already codified at trial initiation; uncertainty remained regarding s(T), which is expected since this stimulus had not been presented at t = 0 ms.

Summary

Introduction

Natural environments are complex places in which animals strive to survive, with hidden variables and stochastic factors such that the information available at any moment is partial, and it must be sampled at several time points and integrated. An animal might learn how and where to seek food, but if the feeding place changes cyclically, or the means of obtaining food change, the animal has to switch strategies accordingly [1,2]. In this case, no single strategy suffices; several strategies must be learned. Learning an SRT with a neural network model can be problematic: since each stimulus/response pairing is positively and negatively reinforced in the long run, learning one rule may erase information about the other rules, constituting a case of catastrophic forgetting [5]. Although brain regions like the prefrontal cortex [6,7] and the striatum [8,9] have been found necessary for learning the SRT, the precise neural mechanisms involved are not well understood.
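The inconsistency of reinforcement described above can be illustrated with a toy simulation. The sketch below is not the authors' spiking network model; it assumes a simplified SRT with two stimuli, two responses, a deterministic reward of +1 or -1, and a rule that reverses every `block_len` trials without any cue. The function names and all parameters are hypothetical choices made for illustration. The first function shows that, averaged over many reversals, every stimulus/response pairing earns a mean reward near zero. The second shows that a hand-coded agent that keeps a one-bit memory of the current rule, updated from the previous trial's outcome, still performs well.

```python
import random


def pair_mean_reward(n_trials=10000, block_len=50, seed=0):
    """Mean reward of each (stimulus, response) pair under a random policy.

    Assumed task: under rule 0 the rewarded response equals the stimulus;
    under rule 1 the mapping is reversed. The rule flips every block_len
    trials and the flip is never signalled to the agent.
    """
    rng = random.Random(seed)
    rule = 0
    totals = {(s, a): 0 for s in (0, 1) for a in (0, 1)}
    counts = {(s, a): 0 for s in (0, 1) for a in (0, 1)}
    for t in range(n_trials):
        if t > 0 and t % block_len == 0:
            rule = 1 - rule                 # uncued reversal
        s = rng.randint(0, 1)               # stimulus
        a = rng.randint(0, 1)               # random response
        r = 1 if a == (s ^ rule) else -1    # reward depends on hidden rule
        totals[(s, a)] += r
        counts[(s, a)] += 1
    return {pair: totals[pair] / counts[pair] for pair in totals}


def win_stay_lose_shift(n_trials=10000, block_len=50, seed=0):
    """Accuracy of an agent that remembers which rule it believes is active.

    The agent keeps its belief after a rewarded trial and flips it after an
    unrewarded one, so it relies on the outcome of the previous trial.
    """
    rng = random.Random(seed)
    rule, belief = 0, 0
    n_correct = 0
    for t in range(n_trials):
        if t > 0 and t % block_len == 0:
            rule = 1 - rule                 # uncued reversal
        s = rng.randint(0, 1)
        a = s ^ belief                      # respond according to belief
        if a == (s ^ rule):
            n_correct += 1                  # rewarded: keep belief
        else:
            belief = 1 - belief             # not rewarded: switch belief
    return n_correct / n_trials


if __name__ == "__main__":
    print(pair_mean_reward())
    print(win_stay_lose_shift())
```

Under these assumptions, a learner that only tracks the long-run value of each stimulus/response pairing has nothing to learn from, whereas an agent that carries information across trials makes an error only on the first trial after each reversal.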

Journal: PloS one	Publication Date: Oct 27, 2017
Citations: 1	License type: CC BY 4.0
