SpiNNaker is a digital neuromorphic architecture, designed specifically for the low power simulation of large-scale spiking neural networks at speeds close to biological real-time. Unlike other neuromorphic systems, SpiNNaker allows users to develop their own neuron and synapse models as well as specify arbitrary connectivity. As a result SpiNNaker has proved to be a powerful tool for studying different neuron models as well as synaptic plasticity—believed to be one of the main mechanisms behind learning and memory in the brain. A number of Spike-Timing-Dependent-Plasticity(STDP) rules have already been implemented on SpiNNaker and have been shown to be capable of solving various learning tasks in real-time. However, while STDP is an important biological theory of learning, it is a form of Hebbian or unsupervised learning and therefore does not explain behaviors that depend on feedback from the environment. Instead, learning rules based on neuromodulated STDP (three-factor learning rules) have been shown to be capable of solving reinforcement learning tasks in a biologically plausible manner. In this paper we demonstrate for the first time how a model of three-factor STDP, with the third-factor representing spikes from dopaminergic neurons, can be implemented on the SpiNNaker neuromorphic system. Using this learning rule we first show how reward and punishment signals can be delivered to a single synapse before going on to demonstrate it in a larger network which solves the credit assignment problem in a Pavlovian conditioning experiment. Because of its extra complexity, we find that our three-factor learning rule requires approximately 2× as much processing time as the existing SpiNNaker STDP learning rules. However, we show that it is still possible to run our Pavlovian conditioning model with up to 1 × 104 neurons in real-time, opening up new research opportunities for modeling behavioral learning on SpiNNaker.
Read full abstract