Stable Learning Research Articles

Neuronal systems that are involved in reinforcement learning must solve the temporal credit assignment problem, i.e., how is a stimulus associated with a reward that is delayed in time? Theoretical studies [1-3] have postulated that neural activity underlying learning ‘tags’ synapses with an ‘eligibility trace’, and that the subsequent arrival of a reward converts the eligibility traces into actual modification of synaptic efficacies. While eligibility traces provide one simple solution to the temporal credit assignment problem, they alone do not constitute a stable learning rule because there is no other mechanism indicating when learning should cease. In order to attain stability, rules involving eligibility traces often assume that once the association is learned, further learning is prevented via an inhibition of the reward stimulus [1,3,4]. Although synaptic plasticity is responsible for reinforcement learning in the brain, theories of reinforcement learning are generally abstract and involve neither neurons nor synapses. Furthermore, biophysical theories of synaptic plasticity typically model unsupervised learning and ignore the contribution of reinforcement. Here we describe a biophysically based theory of reinforcement-modulated synaptic plasticity and postulate the existence of two eligibility traces with different temporal profiles: one corresponding to the induction of LTP, and the other to the induction of LTD. The traces have different kinetics, and their difference in magnitude at the time of reward determines whether synaptic modification will correspond to LTP or LTD. Due to the difference in their decay rates, the LTP and LTD traces can exhibit temporal competition at the reward time and thus provide a mechanism for stable reinforcement learning without the need to inhibit reward.
We test this novel reinforcement-learning rule on an experimentally motivated model of a recurrent cortical network [5], and compare the model results to experimental results at both the cellular and circuit levels. We further suggest that these eligibility traces are implemented via kinases and phosphatases, thus accounting for results at both the cellular and system levels.
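The competition between the two traces can be illustrated with a minimal sketch. It assumes each trace decays as a single exponential after the stimulus and that the weight change at reward time is proportional to the LTP trace minus the LTD trace. The function names, amplitudes, time constants, and learning rate are hypothetical values chosen for illustration, not parameters from the abstract, and the choice of which trace decays faster is likewise an assumption.

```python
import math

def trace(t, amplitude, tau):
    """Value of an exponentially decaying eligibility trace t seconds after the stimulus."""
    return amplitude * math.exp(-t / tau)

def weight_change(t_reward, eta=0.1,
                  ltp_amp=1.0, ltp_tau=2.0,   # assumed: larger, faster-decaying LTP trace
                  ltd_amp=0.6, ltd_tau=5.0):  # assumed: smaller, slower-decaying LTD trace
    """Signed weight change when reward arrives at t_reward (positive = LTP, negative = LTD)."""
    ltp = trace(t_reward, ltp_amp, ltp_tau)
    ltd = trace(t_reward, ltd_amp, ltd_tau)
    return eta * (ltp - ltd)

def crossover_time(ltp_amp=1.0, ltp_tau=2.0, ltd_amp=0.6, ltd_tau=5.0):
    """Reward delay at which the two traces are equal, so no net change occurs."""
    return math.log(ltp_amp / ltd_amp) / (1.0 / ltp_tau - 1.0 / ltd_tau)

if __name__ == "__main__":
    for delay in (0.5, crossover_time(), 4.0):
        print(f"reward delay {delay:.3f} s -> dw = {weight_change(delay):+.5f}")
```

Under these assumed parameters an early reward yields potentiation, a late reward yields depression, and the two traces cancel at the crossover delay. This toy calculation only shows the sign change; it does not reproduce the recurrent network model described in the abstract.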

Many cognitive and motor functions are enabled by the temporal representation and processing of stimuli, but it remains an open issue how neuronal circuits could reliably encode such sequences of information. We consider the task of generating and learning spatiotemporal spike patterns in the context of an attractor memory network, in which each memory is stored in a distributed fashion represented by increased firing in pools of excitatory neurons. Excitatory activity is locally modulated by inhibitory neurons representing lateral inhibition that generates a type of winner-take-all dynamics. Networks of this type have previously been shown to exhibit switching between a non-coding ground state and low-rate memory state activations displaying gamma oscillations [1]; however, stable sequential associations between different attractors were not present. Assuming a probabilistic framework in which local neuron populations discretely encode uncertainty about an attribute in the external world (e.g. a column in visual cortex tuned to a specific edge orientation), we model inter-module synapses using the Bayesian Confidence Propagation Neural Network (BCPNN) plasticity rule [2]. We use a spike-based version of BCPNN in which synaptic weights are statistically inferred by estimating the posterior likelihood of activation for the postsynaptic cell upon presentation of evidence in the form of presynaptic activity patterns. Probabilities are estimated on-line using local exponentially weighted moving averages, with time scales that are biologically motivated by the cascade of events involved in the induction and maintenance of long-term plasticity. Modulating the kinetics of these traces is shown to shape the width of the STDP kernel, which in turn allows attractors to be learned forwards or backwards through time. Stable learning is confirmed by a unimodal stationary weight distribution.
Inference additionally requires modification of a distinct neuronal component, which we interpret as a correlate of intrinsic excitability. Such synaptic [3] and nonsynaptic [4] mechanisms were specifically shown to be relevant for learning and inference. In broader terms, our model instead suggests the presence of and interaction between all of these processes in approximating Bayesian computation. Introducing plastic BCPNN synaptic projections into the attractor network model allows for stable associations between distinct network states. Associations are mediated by different synaptic timescales [5] with fast (AMPA type) and slower (NMDA type) dynamics that in conjunction with the spiking BCPNN rule produce sequences of attractor activations. We demonstrate the feasibility of our model using network simulations of integrate-and-fire neurons, and find that the ability to learn sequences depends on the specific structure of the inhibitory microcircuitry and on the local balance of excitation and inhibition in the network. Preliminary results show that the network can reliably store spatiotemporal patterns consisting of hundreds of discrete network states using just a few thousand neurons. Moreover, excitatory pools can participate multiple times in the sequence, suggesting that spiking attractor networks of this type could support an efficient combinatorial code. Our model provides novel insights into how local and global computations found throughout neocortex and hippocampus, framed in the context of probabilistic inference, could contribute to generating and learning sequential neural activity.
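The on-line probability estimation can be sketched in a few lines. The example below is a simplified, single-stage illustration: it assumes binary activity samples, one exponentially weighted moving average per probability, and the commonly cited BCPNN forms of the weight, log(p_ij / (p_i p_j)), and of the bias, log(p_j), the latter standing in for the intrinsic-excitability term. The spike-based rule in the abstract uses a cascade of traces, which this sketch does not attempt to reproduce. All names and parameter values are hypothetical.

```python
import math

def ewma(old, sample, dt, tau):
    """One step of an exponentially weighted moving average with time constant tau."""
    return old + (dt / tau) * (sample - old)

def bcpnn_estimate(pre, post, dt=1.0, tau=50.0, eps=0.01):
    """Estimate a BCPNN-style weight and bias from paired binary activity sequences.

    pre, post: equal-length sequences of 0/1 samples (assumed input format).
    eps: small initial probability that keeps the logarithms finite.
    """
    p_i, p_j, p_ij = eps, eps, eps * eps
    for x, y in zip(pre, post):
        p_i = ewma(p_i, x, dt, tau)        # presynaptic activation probability
        p_j = ewma(p_j, y, dt, tau)        # postsynaptic activation probability
        p_ij = ewma(p_ij, x * y, dt, tau)  # joint activation probability
    floor = 1e-12                          # guard against numerical underflow
    weight = math.log(max(p_ij, floor) / (max(p_i, floor) * max(p_j, floor)))
    bias = math.log(max(p_j, floor))
    return weight, bias

if __name__ == "__main__":
    together = [1, 0] * 200   # pre and post active on the same steps
    opposite = [0, 1] * 200   # post active only when pre is silent
    print("correlated:     ", bcpnn_estimate(together, together))
    print("anti-correlated:", bcpnn_estimate(together, opposite))
```

Units that tend to be active together end up with a positive weight, and units that are never co-active end up with a negative one. Changing `tau` alters how quickly old evidence is forgotten, which is the kind of kinetic parameter the abstract describes as shaping the learning window.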

Articles published on Stable Learning

Stable reinforcement learning via temporal competition between LTP and LTD traces

Décompression du canal carpien par chirurgie ultra-mini-invasive sous guidage échographique : une étude pilote externe

Ultra minimally invasive sonographically guided carpal tunnel release: An external pilot study

Coexistence of Reward and Unsupervised Learning During the Operant Conditioning of Neural Firing Rates

Multi-model adaptive control based on fuzzy neural networks

Trade-off between learning and exploitation: The Pareto-optimal versus evolutionarily stable learning schedule in cumulative cultural evolution

Neural Network Model Reference Adaptive System Speed Estimation for Sensorless Control of a Doubly Fed Induction Generator

A Novel Wavelet-Neural-Network-Based Robust Controller for IPM Motor Drives

Cognitive Strategies of Encoding, Storage, and Retrieval of Lexicon Popular Techniques Applied by Iranian French Language Learners

An Improved Reinforcement Learning System Using Affective Factors

Probabilistic computation underlying sequence learning in a spiking attractor memory network

ART2 알고리즘을 이용한 효율적인 스마트폰 어플리케이션 실행 방법 [An Efficient Method for Launching Smartphone Applications Using the ART2 Algorithm]

Robust on-line nonlinear systems identification using multilayer dynamic neural networks with two-time scales

ON OPTIMAL LEARNING SCHEDULES AND THE MARGINAL VALUE OF CUMULATIVE CULTURAL EVOLUTION

Stable learning of functional maps in self-organizing spiking neural networks with continuous synaptic plasticity

On-line Adaptive Interval Type-2 Fuzzy Controller Design via Stable SPSA Learning Mechanism

Modeling Component Concentrations of Sodium Aluminate Solution Via Hammerstein Recurrent Neural Networks

Two Types of Haar Wavelet Neural Networks for Nonlinear System Identification

Depression-Biased Reverse Plasticity Rule Is Required for Stable Learning at Top-Down Connections

Evolutionarily stable learning schedules and cumulative culture in discrete generation models
