A scalable hybrid training approach for recurrent spiking neural networks
This paper introduces HYPR, a scalable hybrid training method for recurrent spiking neural networks that combines parallelized approximate online forward learning with constant memory demands, enabling high-throughput training on neuromorphic hardware and achieving near-BPTT performance, especially with oscillatory neuron models.
Abstract: Recurrent spiking neural networks (RSNNs) can be implemented very efficiently in neuromorphic systems. Nevertheless, training of these models with powerful gradient-based learning algorithms is mostly performed on standard digital hardware using backpropagation through time (BPTT). However, BPTT has substantial limitations: it does not permit online training, and its memory consumption scales linearly with the number of computation steps. In contrast, learning methods using forward propagation of gradients operate in an online manner with a memory consumption independent of the number of time steps. These methods enable SNNs to learn from continuous, infinite-length input sequences. In addition, approximate forward propagation algorithms have been developed that can be implemented on neuromorphic hardware. Yet slow execution speed on conventional hardware, as well as inferior performance, has hindered their widespread application. In this work, we introduce HYbrid PRopagation (HYPR), which combines the efficiency of parallelization with approximate online forward learning. Our algorithm yields high-throughput online learning through parallelization, paired with constant, i.e., sequence-length-independent, memory demands. HYPR enables parallelization of parameter update computation over subsequences for RSNNs consisting of almost arbitrary non-linear spiking neuron models. We apply HYPR to networks of spiking neurons with oscillatory subthreshold dynamics. We find that this type of neuron model is particularly well trainable by HYPR, resulting in an unprecedentedly low task performance gap between approximate forward gradient learning and BPTT.
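The memory contrast described in the abstract can be illustrated with a toy example. The sketch below is not HYPR and involves no spiking neurons: it assumes a single linear leaky integrator, h_t = a*h_{t-1} + w*x_t, with a squared-error loss summed over time, a case in which forward propagation of the gradient is exact. All names (`simulate`, `grad_bptt`, `grad_forward`) are hypothetical and chosen for illustration only.

```python
def simulate(w, a, xs):
    """Run the leaky integrator h_t = a*h_{t-1} + w*x_t and return every state."""
    h, states = 0.0, []
    for x in xs:
        h = a * h + w * x
        states.append(h)
    return states


def grad_bptt(w, a, xs, ys):
    """Offline gradient of sum_t 0.5*(h_t - y_t)^2 w.r.t. w via a backward sweep."""
    states = simulate(w, a, xs)  # stores T states: memory grows with sequence length
    grad, delta = 0.0, 0.0
    for t in reversed(range(len(xs))):
        # delta is the total derivative of the loss with respect to h_t
        delta = (states[t] - ys[t]) + a * delta
        grad += delta * xs[t]
    return grad


def grad_forward(w, a, xs, ys):
    """Online gradient: carries only h and the sensitivity e = dh/dw."""
    h = e = grad = 0.0  # three scalars, regardless of sequence length
    for x, y in zip(xs, ys):
        e = a * e + x      # forward recursion for dh_t/dw
        h = a * h + w * x
        grad += (h - y) * e
    return grad
```

Both functions return the same gradient here, but only `grad_forward` can be updated as each input arrives. For nonlinear recurrent networks the exact forward recursion becomes much more expensive, which is why approximate variants such as those discussed in the abstract are used.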
- Research Article
27
- 10.3389/fnins.2022.1018006
- Nov 28, 2022
- Frontiers in Neuroscience
In recent years, the application of deep learning models at the edge has gained attention. Typically, artificial neural networks (ANNs) are trained on graphics processing units (GPUs) and optimized for efficient execution on edge devices. Training ANNs directly at the edge is the next step with many applications such as the adaptation of models to specific situations like changes in environmental settings or optimization for individuals, e.g., optimization for speakers for speech processing. Also, local training can preserve privacy. Over the last few years, many algorithms have been developed to reduce memory footprint and computation. A specific challenge to train recurrent neural networks (RNNs) for processing sequential data is the need for the Back Propagation Through Time (BPTT) algorithm to store the network state of all time steps. This limitation is resolved by the biologically-inspired E-prop approach for training Spiking Recurrent Neural Networks (SRNNs). We implement the E-prop algorithm on a prototype of the SpiNNaker 2 neuromorphic system. A parallelization strategy is developed to split and train networks on the ARM cores of SpiNNaker 2 to make efficient use of both memory and compute resources. We trained an SRNN from scratch on SpiNNaker 2 in real-time on the Google Speech Command dataset for keyword spotting. We achieved an accuracy of 91.12% while requiring only 680 KB of memory for training the network with 25 K weights. Compared to other spiking neural networks with equal or better accuracy, our work is significantly more memory-efficient. In addition, we performed a memory and time profiling of the E-prop algorithm. This is used on the one hand to discuss whether E-prop or BPTT is better suited for training a model at the edge and on the other hand to explore architecture modifications to SpiNNaker 2 to speed up online learning. 
Finally, energy estimates predict that the SRNN can be trained on SpiNNaker 2 with 12 times less energy than on an NVIDIA V100 GPU.
- Conference Article
8
- 10.1109/aicas54282.2022.9869963
- Jun 13, 2022
Recurrent spiking neural networks (SNNs) are inspired by the working principles of biological nervous systems that offer unique temporal dynamics and event-based processing. Recently, the error backpropagation through time (BPTT) algorithm has been successfully employed to train SNNs offline, with comparable performance to artificial neural networks (ANNs) on complex tasks. However, BPTT has severe limitations for online learning scenarios of SNNs where the network is required to simultaneously process and learn from incoming data. Specifically, as BPTT separates the inference and update phases, it would require storing all neuronal states for calculating the weight updates backwards in time. To address these fundamental issues, alternative credit assignment schemes are required. Within this context, neuromorphic hardware (NMHW) implementations of SNNs can greatly benefit from in-memory computing (IMC) concepts that follow the brain-inspired collocation of memory and processing, further enhancing their energy efficiency. In this work, we utilize e-prop, a biologically-inspired local and online training algorithm that approximates BPTT and is compatible with IMC, and present an approach to support both inference and training of a recurrent SNN using NMHW. To do so, we embed the SNN weights on an in-memory computing NMHW with phase-change memory (PCM) devices and integrate it into a hardware-in-the-loop training setup. We develop our approach with respect to limited precision and imperfections of the analog devices using a PCM-based simulation framework and an NMHW consisting of in-memory computing cores fabricated in 14 nm CMOS technology with 256×256 PCM crossbar arrays. We demonstrate that our approach is robust even to 4-bit precision and achieves competitive performance to a floating-point 32-bit realization, while simultaneously equipping the SNN with online training capabilities and exploiting the acceleration benefits of NMHW.
- Research Article
52
- 10.3389/fnins.2022.951164
- Nov 11, 2022
- Frontiers in Neuroscience
Spatio-temporal pattern recognition is a fundamental ability of the brain which is required for numerous real-world activities. Recent deep learning approaches have reached outstanding accuracies in such tasks, but their implementation on conventional embedded solutions is still very computationally and energy expensive. Tactile sensing in robotic applications is a representative example where real-time processing and energy efficiency are required. Following a brain-inspired computing approach, we propose a new benchmark for spatio-temporal tactile pattern recognition at the edge through Braille letter reading. We recorded a new Braille letters dataset based on the capacitive tactile sensors of the iCub robot's fingertip. We then investigated the importance of spatial and temporal information as well as the impact of event-based encoding on spike-based computation. Afterward, we trained and compared feedforward and recurrent Spiking Neural Networks (SNNs) offline using Backpropagation Through Time (BPTT) with surrogate gradients, then we deployed them on the Intel Loihi neuromorphic chip for fast and efficient inference. We compared our approach to standard classifiers, in particular to the Long Short-Term Memory (LSTM) deployed on the embedded NVIDIA Jetson GPU, in terms of classification accuracy, power, and energy consumption together with computational delay. Our results show that the LSTM reaches ~97% of accuracy, outperforming the recurrent SNN by ~17% when using continuous frame-based data instead of event-based inputs. However, the recurrent SNN on Loihi with event-based inputs is ~500 times more energy-efficient than the LSTM on Jetson, requiring a total power of only ~30 mW. This work proposes a new benchmark for tactile sensing and highlights the challenges and opportunities of event-based encoding, neuromorphic hardware, and spike-based computing for spatio-temporal pattern recognition at the edge.
- Book Chapter
2
- 10.1007/978-3-030-86383-8_19
- Jan 1, 2021
In this paper, we demonstrate that goal-directed behavior unfolds in recurrent spiking neural networks (RSNNs) when intentions are projected onto continuously progressing spike dynamics encoding the recent history of an agent’s state. The projections can be realized either via backpropagation through time (BPTT) over a certain time window or directly and temporally locally, in an online fashion, using a biologically inspired inference rule. In contrast to previous studies that use, for instance, LSTM-like models, our approach is biologically more plausible as it fully relies on spike-based processing of sensorimotor experiences. Specifically, we show that precise control of a flying vehicle in a 3D environment is possible. Moreover, we show that more complex mental traces of foresighted movement imagination unfold that effectively help to circumvent learned obstacles.
- Research Article
5
- 10.1007/s00521-010-0506-1
- Dec 31, 2010
- Neural Computing and Applications
This paper proposes a new hybrid approach for recurrent neural networks (RNN). The basic idea of this approach is to train an input layer by unsupervised learning and an output layer by supervised learning. In this method, the Kohonen algorithm is used for unsupervised learning, and dynamic gradient descent method is used for supervised learning. The performances of the proposed algorithm are compared with backpropagation through time (BPTT) on three benchmark problems. Simulation results show that the performances of the new proposed algorithm exceed the standard backpropagation through time in the reduction of the total number of iterations and in the learning time required in the training process.
- Research Article
48
- 10.1109/tnnls.2022.3153985
- Nov 1, 2023
- IEEE Transactions on Neural Networks and Learning Systems
Biological neural networks are equipped with an inherent capability to continuously adapt through online learning. This aspect remains in stark contrast to learning with error backpropagation through time (BPTT) that involves offline computation of the gradients due to the need to unroll the network through time. Here, we present an alternative online learning algorithmic framework for deep recurrent neural networks (RNNs) and spiking neural networks (SNNs), called online spatio-temporal learning (OSTL). It is based on insights from biology and proposes the clear separation of spatial and temporal gradient components. For shallow SNNs, OSTL is gradient equivalent to BPTT, enabling for the first time online training of SNNs with BPTT-equivalent gradients. In addition, the proposed formulation unveils a class of SNN architectures trainable online at low time complexity. Moreover, we extend OSTL to a generic form, applicable to a wide range of network architectures, including networks comprising long short-term memory (LSTM) and gated recurrent units (GRUs). We demonstrate the operation of our algorithmic framework on various tasks from language modeling to speech recognition and obtain results on par with the BPTT baselines.
- Book Chapter
9
- 10.1007/978-3-030-30487-4_38
- Jan 1, 2019
Learning compositional dynamics with recurrent neural networks (RNNs) trained with back-propagation through time (BPTT) is usually a difficult task. Typically RNNs learn the consecutive shape along target sequences from time step to time step, focusing on local temporal correlations. When the challenge is to identify and model independent, unknown data subcomponents, that is, data generating causes on-the-fly during training, however, this local temporal shape-oriented inductive learning bias is obstructive. We propose a modular, compositional RNN architecture and derive simple procedures to automatically infer the source subdynamics that generate the data. We show that the involved error signal separation can be used for both teacher forcing and model-distinct target signal provision in the compositional RNN architecture. As a result, the entire network is able to learn compositional dynamics, developing emergent, flexibly adaptable signal decompositions within the distributed modules. We demonstrate that in this way simple RNNs trained with BPTT can learn sequences that could so far only be solved effectively with reservoir computing approaches. Moreover we show that these RNNs are much more robust against signal noise when compared to traditional BPTT or reservoir computing approaches.
- Research Article
9
- 10.3389/fnins.2024.1439155
- Jul 10, 2024
- Frontiers in Neuroscience
Recurrent neural networks (RNNs) hold immense potential for computations due to their Turing completeness and sequential processing capabilities, yet existing methods for their training encounter efficiency challenges. Backpropagation through time (BPTT), the prevailing method, extends the backpropagation (BP) algorithm by unrolling the RNN over time. However, this approach suffers from significant drawbacks, including the need to interleave forward and backward phases and store exact gradient information. Furthermore, BPTT has been shown to struggle to propagate gradient information for long sequences, leading to vanishing gradients. An alternative strategy to using gradient-based methods like BPTT involves stochastically approximating gradients through perturbation-based methods. This learning approach is exceptionally simple, necessitating only forward passes in the network and a global reinforcement signal as feedback. Despite its simplicity, the random nature of its updates typically leads to inefficient optimization, limiting its effectiveness in training neural networks. In this study, we present a new approach to perturbation-based learning in RNNs whose performance is competitive with BPTT, while maintaining the inherent advantages over gradient-based learning. To this end, we extend the recently introduced activity-based node perturbation (ANP) method to operate in the time domain, leading to more efficient learning and generalization. We subsequently conduct a range of experiments to validate our approach. Our results show similar performance, convergence time and scalability when compared to BPTT, strongly outperforming standard node perturbation and weight perturbation methods. These findings suggest that perturbation-based learning methods offer a versatile alternative to gradient-based methods for training RNNs which can be ideally suited for neuromorphic computing applications.
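To make the idea of forward-only learning from a global scalar signal concrete, here is a minimal sketch of plain weight perturbation on a toy quadratic loss. It is not the activity-based node perturbation (ANP) method the abstract extends. The target vector, step size, random-sign perturbations and two-sided difference are all assumptions made for illustration.

```python
import random


def loss(w, target):
    """Toy quadratic loss standing in for a network's task loss."""
    return 0.5 * sum((wi - ti) ** 2 for wi, ti in zip(w, target))


def perturbation_step(w, target, rng, sigma=1e-3, lr=0.1):
    """One weight-perturbation update using two forward evaluations."""
    xi = [rng.choice((-1.0, 1.0)) for _ in w]  # random sign perturbation
    plus = loss([wi + sigma * x for wi, x in zip(w, xi)], target)
    minus = loss([wi - sigma * x for wi, x in zip(w, xi)], target)
    # Single scalar feedback: the directional derivative along xi.
    signal = (plus - minus) / (2.0 * sigma)
    # Every weight moves along its own perturbation, scaled by the scalar.
    return [wi - lr * signal * x for wi, x in zip(w, xi)]


def train(steps=200, seed=0):
    """Run repeated perturbation steps and record the loss after each."""
    rng = random.Random(seed)
    target = [1.0, -2.0, 0.5]  # hypothetical optimum
    w = [0.0, 0.0, 0.0]
    history = [loss(w, target)]
    for _ in range(steps):
        w = perturbation_step(w, target, rng)
        history.append(loss(w, target))
    return history
```

No backward pass or stored activations are needed, but each step only uses gradient information along one random direction. This is the inefficiency that the abstract attributes to standard perturbation methods.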
- Research Article
6
- 10.1088/2634-4386/ada851
- Feb 7, 2025
- Neuromorphic Computing and Engineering
Programming recurrent spiking neural networks (RSNNs) to robustly perform multi-timescale computation remains a difficult challenge. To address this, we describe a single-shot weight learning scheme to embed robust multi-timescale dynamics into attractor-based RSNNs, by exploiting the properties of high-dimensional distributed representations. We embed finite state machines into the RSNN dynamics by superimposing a symmetric autoassociative weight matrix and asymmetric transition terms, which are each formed by the vector binding of an input and heteroassociative outer-products between states. Our approach is validated through simulations with highly nonideal weights; an experimental closed-loop memristive hardware setup; and on Loihi 2, where it scales seamlessly to large state machines. This work introduces a scalable approach to embed robust symbolic computation through recurrent dynamics into neuromorphic hardware, without requiring parameter fine-tuning or significant platform-specific optimisation. Moreover, it demonstrates that distributed symbolic representations serve as a highly capable representation-invariant language for cognitive algorithms in neuromorphic hardware.
- Research Article
- 10.1038/s41598-026-35641-z
- Feb 18, 2026
- Scientific Reports
Neuromorphic systems that employ advanced synaptic learning rules, such as the three-factor learning rule, require synaptic devices of increased complexity. Herein, a novel neoHebbian artificial synapse utilizing ReRAM devices has been proposed and experimentally validated to meet this demand. This synapse features two distinct state variables: a neuron coupling weight and an "eligibility trace" that dictates synaptic weight updates. The coupling weight is encoded in the ReRAM conductance, while the "eligibility trace" is encoded in the local temperature of the ReRAM and is modulated by applying voltage pulses to a physically co-located resistive heating element. The utility of the proposed synapse has been investigated using two representative tasks: first, temporal signal classification using Recurrent Spiking Neural Networks (RSNNs) employing the e-prop algorithm, and second, Reinforcement Learning (RL) for path planning tasks in feedforward networks using a modified version of the same learning rule. System-level simulations, accounting for various device and system-level non-idealities, confirm that these synapses offer a robust solution for the fast, compact, and energy-efficient implementation of advanced learning rules in neuromorphic hardware.
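The two state variables described in this abstract can be sketched in software as follows. This is an abstract illustration of a generic three-factor rule, not a model of the ReRAM device or of e-prop itself. The class name, decay constant, learning rate and the simple pre-times-post coincidence term are hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class ThreeFactorSynapse:
    weight: float = 0.0  # coupling weight (conductance in the device)
    trace: float = 0.0   # eligibility trace (local temperature in the device)
    decay: float = 0.9   # hypothetical per-step trace decay
    lr: float = 0.1      # hypothetical learning rate

    def step(self, pre: float, post: float, learning_signal: float = 0.0) -> None:
        """Advance one time step of the three-factor rule."""
        # Factors 1 and 2: pre/post coincidence charges the decaying trace.
        self.trace = self.decay * self.trace + pre * post
        # Factor 3: a possibly delayed learning signal gates the weight change.
        self.weight += self.lr * learning_signal * self.trace
```

A short usage example: after `s = ThreeFactorSynapse()` and `s.step(1.0, 1.0)`, the trace is charged while the weight is unchanged. A later call such as `s.step(0.0, 0.0, learning_signal=1.0)` converts whatever remains of the decayed trace into a weight update.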
- Research Article
6
- 10.1109/embc.2017.8037463
- Jul 1, 2017
- Annual International Conference of the IEEE Engineering in Medicine and Biology Society
Spiking neural networks are biologically plausible and power-efficient on neuromorphic hardware, while recurrent neural networks have been proven to be efficient on time series data. However, how to use the recurrent property to improve the performance of spiking neural networks is still a problem. This paper proposes a recurrent spiking neural network for character recognition using trajectories. In the network, a new encoding method is designed, in which varying time ranges of input streams are used in different recurrent layers. This is able to improve the generalization ability of our model compared with general encoding methods. The experiments are conducted on four groups of the character data set from University of Edinburgh. The results show that our method can achieve a higher average recognition accuracy than existing methods.
- Research Article
2
- 10.14569/ijacsa.2015.061101
- Jan 1, 2015
- International Journal of Advanced Computer Science and Applications
The prediction of the next serial criminal time is important in the field of criminology for preventing the recurring actions of serial criminals. In the associated dynamic systems, one of the main sources of instability and poor performance is the time delay, which is commonly predicted based on nonlinear methods. The aim of this study is to introduce a dynamic neural network model using nonlinear autoregressive time series with exogenous (external) input (NARX) and Back Propagation Through Time (BPTT), which is verified intensively with MATLAB to predict and model the crime times for the next distance of serial cases. Recurrent neural networks have been extensively used for modeling of nonlinear dynamic systems. There are different types of recurrent neural networks such as Time Delay Neural Networks (TDNN), layer recurrent networks, NARX, and BPTT. The NARX model draws more attention for the two cases of input-output modeling of dynamic systems and time series prediction. In this study, a comparison of the NARX and BPTT models for the prediction of the next serial criminal time illustrates that the NARX model exhibits better performance for the prediction of serial cases than the BPTT model. Our future work aims to improve the NARX model by combining objective functions.
- Research Article
4
- 10.3389/fnins.2024.1412559
- Jun 20, 2024
- Frontiers in Neuroscience
In neural circuits, recurrent connectivity plays a crucial role in network function and stability. However, existing recurrent spiking neural networks (RSNNs) are often constructed by random connections without optimization. While RSNNs can produce rich dynamics that are critical for memory formation and learning, systemic architectural optimization of RSNNs is still an open challenge. We aim to enable systematic design of large RSNNs via a new scalable RSNN architecture and automated architectural optimization. We compose RSNNs based on a layer architecture called Sparsely-Connected Recurrent Motif Layer (SC-ML) that consists of multiple small recurrent motifs wired together by sparse lateral connections. The small size of the motifs and sparse inter-motif connectivity leads to an RSNN architecture scalable to large network sizes. We further propose a method called Hybrid Risk-Mitigating Architectural Search (HRMAS) to systematically optimize the topology of the proposed recurrent motifs and SC-ML layer architecture. HRMAS is an alternating two-step optimization process by which we mitigate the risk of network instability and performance degradation caused by architectural change by introducing a novel biologically-inspired "self-repairing" mechanism through intrinsic plasticity. The intrinsic plasticity is introduced to the second step of each HRMAS iteration and acts as unsupervised fast self-adaptation to structural and synaptic weight modifications introduced by the first step during the RSNN architectural "evolution." We demonstrate that the proposed automatic architecture optimization leads to significant performance gains over existing manually designed RSNNs: we achieve 96.44% on TI46-Alpha, 94.66% on N-TIDIGITS, 90.28% on DVS-Gesture, and 98.72% on N-MNIST. To the best of the authors' knowledge, this is the first work to perform systematic architecture optimization on RSNNs.
- Conference Article
- 10.1109/iconscept66142.2025.11437374
- Dec 6, 2025
Electromyography (EMG) signals provide vital insights into muscular dynamics, making them indispensable for gesture recognition tasks. However, conventional deep learning models often suffer from high power consumption, limiting their deployment on neuromorphic edge hardware. This study proposes a recurrent spiking neural network (RSNN) based framework for EMG classification with multiple neuron configurations involving Integrate and Fire (IF) and Leaky Integrate and Fire (LIF) neuron models. These configurations were analyzed to find a better trade-off between temporal stability and computational efficiency. Among the various neuron cluster configurations within the reservoir, a network based entirely on IF neurons achieved the highest classification accuracy of 80%. This also outperforms existing deep learning based approaches. These results underscore the potential of RSNNs as an efficient framework for real-time EMG gesture recognition in neuromorphic hardware.
- Book Chapter
2
- 10.1007/978-3-030-86383-8_18
- Jan 1, 2021
Active Tuning is an optimization paradigm specifically designed to increase the robustness and generalization ability of temporal forward models like recurrent neural networks (RNNs). This work explores how the Active Tuning method can be used to optimize the internal dynamics of recurrent spiking neural networks (RSNNs). Active Tuning decouples the network from direct influence of the data stream and instead tunes its internal dynamics. This is based on the temporal gradient signals from propagating the error between outputs and observations backwards through time. Meanwhile, the network is running in a closed-loop prediction cycle, where its own output is used as the next input. As modern ANNs often demand excessive amounts of computational resources, spiking neural networks (SNNs) aim for the energy efficiency demonstrated by the human brain. This is accomplished by using an event-driven model inspired by the spiking behavior of biological neurons. The target of the Active Tuning optimization in RSNNs is the membrane potential of the neurons in the hidden layer. We show in two scenarios how RSNNs handle noisy inputs and that Active Tuning is a reliable method to increase their robustness as well as general prediction performance.