Learning in deep neural networks and brains with similarity-weighted interleaved learning

Abstract

Understanding how the brain learns throughout a lifetime remains a long-standing challenge. In artificial neural networks (ANNs), incorporating novel information too rapidly results in catastrophic interference: an abrupt loss of previously acquired knowledge. Complementary Learning Systems Theory (CLST) suggests that new memories can be gradually integrated into the neocortex by interleaving them with existing knowledge. This approach, however, has been assumed to require interleaving all existing knowledge every time something new is learned, which is implausible because it is time-consuming and requires a large amount of data. We show that deep, nonlinear ANNs can learn new information by interleaving only a subset of old items that share substantial representational similarity with the new information. With such similarity-weighted interleaved learning (SWIL), ANNs can learn new information rapidly, to a similar accuracy level and with minimal interference, while using far fewer old items per epoch (fast and data-efficient). SWIL is shown to work with various standard classification datasets (Fashion-MNIST, CIFAR10, and CIFAR100), deep neural network architectures, and sequential learning frameworks. We show that the data efficiency and the speedup in learning new items grow roughly in proportion to the number of nonoverlapping classes stored in the network, implying an enormous potential speedup in the human brain, which encodes a vast number of separate categories. Finally, we propose a theoretical model of how SWIL might be implemented in the brain.
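The core mechanism described above can be sketched in a few lines: compare each stored class's mean representation with the new class, and sample the per-epoch replay set in proportion to that similarity. This is a minimal illustrative sketch, not the paper's code; the feature dimensions, class structure, cosine similarity, and replay budget below are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: penultimate-layer features for five previously learned
# classes, each clustered around its own random direction in feature space.
n_old, dim, per_class = 5, 32, 100
class_dirs = rng.normal(size=(n_old, dim))
old_feats = {c: class_dirs[c] + 0.1 * rng.normal(size=(per_class, dim))
             for c in range(n_old)}
# The new class happens to resemble stored class 1.
new_feats = class_dirs[1] + 0.3 * rng.normal(size=(50, dim))

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Similarity of each stored class's mean representation to the new class mean.
new_mean = new_feats.mean(axis=0)
sims = np.array([cosine(old_feats[c].mean(axis=0), new_mean)
                 for c in range(n_old)])

# Turn similarities into a sampling distribution (negatives clipped to zero).
weights = np.clip(sims, 0.0, None)
probs = weights / weights.sum()

# Draw a small per-epoch replay budget of old items, weighted by similarity,
# instead of interleaving the entire stored dataset.
budget = 40
counts = rng.multinomial(budget, probs)
replay = {c: old_feats[c][rng.choice(per_class, size=k, replace=False)]
          for c, k in enumerate(counts) if k > 0}
```

Because the sampling distribution concentrates on representationally similar classes, most of the small replay budget goes to the one stored class that overlaps with the new item, which is what makes the scheme both fast and data-efficient.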

Similar Papers
  • Research Article
  • Cited by 294
  • 10.1037/a0033812
Incorporating rapid neocortical learning of new schema-consistent information into complementary learning systems theory.
  • Nov 1, 2013
  • Journal of Experimental Psychology: General
  • James L. McClelland

The complementary learning systems theory of the roles of hippocampus and neocortex (McClelland, McNaughton, & O'Reilly, 1995) holds that the rapid integration of arbitrary new information into neocortical structures is avoided to prevent catastrophic interference with structured knowledge representations stored in synaptic connections among neocortical neurons. Recent studies (Tse et al., 2007, 2011) showed that neocortical circuits can rapidly acquire new associations that are consistent with prior knowledge. The findings challenge the complementary learning systems theory as previously presented. However, new simulations extending those reported in McClelland et al. (1995) show that new information that is consistent with knowledge previously acquired by a putatively cortexlike artificial neural network can be learned rapidly and without interfering with existing knowledge; it is when inconsistent new knowledge is acquired quickly that catastrophic interference ensues. Several important features of the findings of Tse et al. (2007, 2011) are captured in these simulations, indicating that the neural network model used in McClelland et al. has characteristics in common with neocortical learning mechanisms. An additional simulation generalizes beyond the network model previously used, showing how the rate of change of cortical connections can depend on prior knowledge in an arguably more biologically plausible network architecture. In sum, the findings of Tse et al. are fully consistent with the idea that hippocampus and neocortex are complementary learning systems. Taken together, these findings and the simulations reported here advance our knowledge by bringing out the role of consistency of new experience with existing knowledge and demonstrating that the rate of change of connections in real and artificial neural networks can be strongly prior-knowledge dependent.

  • Conference Article
  • Cited by 15
  • 10.1109/iceconf57129.2023.10084252
Accurate Weather Forecasting for Rainfall Prediction Using Artificial Neural Network Compared with Deep Learning Neural Network
  • Jan 5, 2023
  • D Vasudeva Rayudu + 1 more

Aim: This study set out to determine how well AI approaches like Artificial Neural Networks (ANNs) and Deep Learning Neural Networks (DNNs) might be used to forecast rainfall. These methods of weather prediction were tested and ranked in terms of their efficiency. Materials and Techniques: Group 1 uses a Deep Learning Neural Network (DNN) for analysis, whereas Group 2 uses an Artificial Neural Network (ANN), with a pre-test power (G-power) of 80% and an alpha error rate of 0.05 for the sample test analysis. In all, 20 samples were identified, 10 from each category. According to the outcomes of the MATLAB simulations, the accuracy of the Deep Learning Neural Network (DNN) is 92.59%, while that of the Artificial Neural Network (ANN) is 95.68%. Using the SPSS statistical program, we find a significance value of 0.034 (p < 0.05). This study concludes that the Artificial Neural Network (ANN) algorithm outperforms the Deep Learning Neural Network (DNN) method in predicting rainfall in the context of cutting-edge weather forecasting.

  • Research Article
  • Cited by 59
  • 10.1016/j.tics.2020.09.002
Artificial Intelligence and the Common Sense of Animals.
  • Oct 8, 2020
  • Trends in Cognitive Sciences
  • Murray Shanahan + 3 more


  • Book Chapter
  • Cited by 2
  • 10.1007/978-3-030-13773-1_9
Foundation of Deep Machine Learning in Neural Networks
  • Jan 1, 2019
  • Chih-Cheng Hung + 2 more

This chapter introduces several basic neural network models, which serve as the foundation for the further development of deep machine learning in neural networks. Deep machine learning takes a very different approach to feature extraction compared with traditional methods, which have been widely used in the pattern recognition approach. Deep machine learning in neural networks automatically "learns" the feature extractors, instead of using human knowledge to design and build them, as in the pattern recognition approach. We describe some typical neural network models that have been successfully used in image and video analysis. One type of neural network introduced here uses supervised learning, such as the feed-forward multi-layer neural networks; the other uses unsupervised learning, such as the Kohonen model (also called the self-organizing map, SOM). Both types were widely used in visual recognition before the rise of deep machine learning in convolutional neural networks (CNNs). Specifically, the following models are introduced: (1) the basic neuron model and perceptron, (2) the traditional feed-forward multi-layer neural networks using backpropagation, (3) Hopfield neural networks, (4) Boltzmann machines, (5) Restricted Boltzmann machines and Deep Belief Networks, (6) self-organizing maps, and (7) the Cognitron and Neocognitron. Both the Cognitron and Neocognitron are deep neural networks that can self-organize without any supervision. These models are the foundation for discussing texture classification using deep neural network models.

  • Book Chapter
  • Cited by 1
  • 10.1016/b978-0-443-15452-2.00011-x
Chapter 11 - Computational intelligence on medical imaging with artificial neural networks
  • Jan 1, 2025
  • Mining Biomedical Text, Images and Visual Features for Information Retrieval
  • Oznur Ozaltin + 1 more


  • Research Article
  • Cited by 82
  • 10.3390/rs13132638
Deep Neural Network Utilizing Remote Sensing Datasets for Flood Hazard Susceptibility Mapping in Brisbane, Australia
  • Jul 5, 2021
  • Remote Sensing
  • Bahareh Kalantar + 6 more

Large damages and losses resulting from floods are widely reported across the globe. Thus, the identification of flood-prone zones on a flood susceptibility map is essential. To do so, 13 conditioning factors influencing flood occurrence in the Brisbane river catchment in Australia (i.e., topographic, water-related, geological, and land use factors) were acquired for further processing and modeling. In this study, artificial neural networks (ANN), deep learning neural networks (DLNN), and DLNN optimized using particle swarm optimization (PSO) were exploited to predict and estimate the areas susceptible to future floods. The analysis of the conditioning factors for the region highlighted that altitude, distance from river, sediment transport index (STI), and slope played the most important roles, whereas the stream power index (SPI) did not contribute to the hazardous situation. The performance of the models was evaluated against statistical tests such as sensitivity, specificity, the area under the curve (AUC), and the true skill statistic (TSS). The DLNN and PSO-DLNN models obtained the highest sensitivity values (0.99) in the training stage compared with ANN. Moreover, the validation values of specificity and TSS for PSO-DLNN were the highest, at 0.98 and 0.90, respectively, compared with those obtained by ANN and DLNN. The best accuracies by AUC were achieved by PSO-DLNN (0.99 on training and 0.98 on testing datasets), followed by DLNN and ANN. Therefore, the optimized PSO-DLNN proved its robustness compared with the other methods.

  • Research Article
  • Cited by 81
  • 10.1016/j.jksuci.2020.01.016
Smart occupancy detection for road traffic parking using deep extreme learning machine
  • Feb 6, 2020
  • Journal of King Saud University - Computer and Information Sciences
  • Shahan Yamin Siddiqui + 3 more


  • Research Article
  • Cited by 8
  • 10.1016/j.knosys.2023.110959
LLEDA—Lifelong Self-Supervised Domain Adaptation
  • Sep 9, 2023
  • Knowledge-Based Systems
  • Mamatha Thota + 2 more

Humans and animals have the ability to continuously learn new information over their lifetime without losing previously acquired knowledge. However, artificial neural networks struggle with this due to new information conflicting with old knowledge, resulting in catastrophic forgetting. The complementary learning systems (CLS) theory (McClelland and McNaughton, 1995; Kumaran et al. 2016) suggests that the interplay between hippocampus and neocortex systems enables long-term and efficient learning in the mammalian brain, with memory replay facilitating the interaction between these two systems to reduce forgetting. The proposed Lifelong Self-Supervised Domain Adaptation (LLEDA) framework draws inspiration from the CLS theory and mimics the interaction between two networks: a DA network inspired by the hippocampus that quickly adjusts to changes in data distribution and an SSL network inspired by the neocortex that gradually learns domain-agnostic general representations. LLEDA’s latent replay technique facilitates communication between these two networks by reactivating and replaying the past memory latent representations to stabilize long-term generalization and retention without interfering with the previously learned information. Extensive experiments demonstrate that the proposed method outperforms several other methods resulting in a long-term adaptation while being less prone to catastrophic forgetting when transferred to new domains.
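The latent replay idea summarized above, storing past latent representations and mixing them back into training, can be illustrated with a toy buffer. This is a generic sketch of latent replay under a reservoir-sampling assumption, not LLEDA's actual mechanism; the class name, capacity, and stand-in latents are all invented for the example.

```python
import random

random.seed(0)

class LatentReplayBuffer:
    """Toy reservoir-sampled buffer of latent representations."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []
        self.seen = 0

    def add(self, latent):
        # Classic reservoir sampling: each latent seen so far remains in the
        # buffer with equal probability capacity / seen.
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(latent)
        else:
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = latent

    def replay_batch(self, k):
        # Draw stored latents to interleave with training on the new domain.
        return random.sample(self.items, min(k, len(self.items)))

buf = LatentReplayBuffer(capacity=100)
for step in range(1000):
    buf.add(("latent", step))   # stand-in for a real latent vector
batch = buf.replay_batch(16)
```

Replaying compact latent representations rather than raw inputs is what lets such a buffer stabilize long-term retention at a small memory cost.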

  • Research Article
  • Cited by 4
  • 10.3390/e25030401
Detecting Information Relays in Deep Neural Networks
  • Feb 22, 2023
  • Entropy
  • Arend Hintze + 1 more

Deep learning of artificial neural networks (ANNs) is creating highly functional processes that are, unfortunately, nearly as hard to interpret as their biological counterparts. Identification of functional modules in natural brains plays an important role in cognitive science and neuroscience alike, and can be carried out using a wide range of technologies such as fMRI, EEG/ERP, MEG, or calcium imaging. However, we do not have such robust methods at our disposal when it comes to understanding functional modules in artificial neural networks. Ideally, understanding which parts of an artificial neural network perform what function might help us to address a number of vexing problems in ANN research, such as catastrophic forgetting and overfitting. Furthermore, revealing a network's modularity could improve our trust in these networks by making such black boxes more transparent. Here, we introduce a new information-theoretic concept that proves useful in understanding and analyzing a network's functional modularity: the relay information. The relay information measures how much information groups of neurons that participate in a particular function (modules) relay from inputs to outputs. Combined with a greedy search algorithm, relay information can be used to identify computational modules in neural networks. We also show that the functionality of modules correlates with the amount of relay information they carry.
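The underlying idea, measuring how much input information a candidate group of neurons relays to the output, can be illustrated on a tiny boolean network. This is only a toy illustration of the concept, not the paper's estimator: we ablate every hidden unit outside the candidate module and compute the mutual information I(X; Y) over uniform inputs (for a deterministic network, H(Y|X) = 0, so I(X; Y) = H(Y)). The network and module indices are invented for the example.

```python
from itertools import product
from math import log2
from collections import Counter

def forward(x1, x2, mask):
    # Toy boolean net; `mask` is the set of hidden indices kept active,
    # all other hidden units are ablated to 0.
    h = [x1 & x2, x1 | x2, x1]          # hidden layer (h[2] is unused below)
    h = [h[i] if i in mask else 0 for i in range(3)]
    return h[0] ^ h[1]                  # output reads only h[0] and h[1]

def relay_bits(mask):
    # Mutual information I(X; Y) over uniform inputs with only `mask` active.
    inputs = list(product([0, 1], repeat=2))
    ys = [forward(x1, x2, mask) for x1, x2 in inputs]
    n = len(inputs)
    h_y = -sum((c / n) * log2(c / n) for c in Counter(ys).values())
    return h_y  # deterministic net: H(Y|X) = 0, so I(X; Y) = H(Y)

full = relay_bits({0, 1, 2})            # whole hidden layer active
module = relay_bits({0, 1})             # candidate module only
```

In this toy case the module {h0, h1} relays as much information as the full hidden layer (the XOR output depends only on those two units), while smaller subsets relay strictly less, which is the property a greedy search over modules can exploit.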

  • Research Article
  • Cited by 4
  • 10.1051/epjconf/201715806008
Deep Learning Neural Networks and Bayesian Neural Networks in Data Analysis
  • Jan 1, 2017
  • EPJ Web of Conferences
  • Andrey Chernoded + 3 more

Most modern analyses in high energy physics use signal-versus-background classification techniques from machine learning, and neural networks in particular. Deep learning neural networks are the most promising modern technique to separate signal from background and nowadays can be widely and successfully implemented as part of a physics analysis. In this article we compare deep learning and Bayesian neural networks as classifiers in an instance of top quark analysis.

  • Research Article
  • Cited by 1
  • 10.2139/ssrn.3262650
Deep Learning Neural Networks as a Model of Saccadic Generation
  • Jan 1, 2018
  • SSRN Electronic Journal
  • Sofia Krasovskaya + 2 more

Approximately twenty years ago, Laurent Itti and Christof Koch created a model of saliency in visual attention in an attempt to recreate the work of biological pyramidal neurons by mimicking neurons with centre-surround receptive fields. The Saliency Model has launched many studies that contributed to the understanding of layers of vision and the sphere of visual attention. The aim of the current study is to improve this model by using an artificial neural network that generates saccades similar to how humans make saccadic eye movements. The proposed model uses a Leaky Integrate-and-Fire layer for temporal predictions, and replaces parallel feature maps with a deep learning neural network in order to create a generative model that is precise for both spatial and temporal predictions. Our deep neural network was able to predict eye movements based on unsupervised learning from raw image input, as well as supervised learning from fixation maps retrieved during an eye-tracking experiment conducted with 35 participants at later stages in order to train a 2D softmax layer. The results imply that it is possible to match the spatial and temporal distributions of the model to spatial and temporal human distributions.

  • Research Article
  • Cited by 272
  • 10.1016/j.procs.2017.09.045
Solar Irradiance Forecasting Using Deep Neural Networks
  • Jan 1, 2017
  • Procedia Computer Science
  • Ahmad Alzahrani + 3 more


  • Conference Article
  • Cited by 46
  • 10.1109/icdsp.2015.7252029
Malaysia traffic sign recognition with convolutional neural network
  • Jul 1, 2015
  • Mian Mian Lau + 2 more

Traffic sign recognition is an important subsystem in advanced driver assistance systems (ADAS), assisting a driver in detecting critical driving scenarios and making immediate decisions. Recently, deep-architecture neural networks have become popular because they adapt well to various kinds of scenarios, even those not seen during training. Therefore, a deep-architecture neural network is implemented to perform traffic sign classification in order to improve the traffic sign recognition rate. A comparative study of deep and shallow architecture neural networks is presented in this paper; these refer to a convolutional neural network (CNN) and a radial basis function neural network (RBFNN), respectively. In the simulation results, two training modes were compared: incremental training and batch training. Experimental results show that incremental training trains faster than batch training. The performance of the convolutional neural network is evaluated on the Malaysian traffic sign database and achieves a 99% recognition rate.

  • Research Article
  • Cited by 31
  • 10.1111/coin.12236
Adaptive transfer learning in deep neural networks: Wind power prediction using knowledge transfer from region to region and between different task domains
  • Aug 9, 2019
  • Computational Intelligence
  • Aqsa Saeed Qureshi + 1 more

Transfer learning (TL) in deep neural networks is gaining importance because, in most applications, labeling data is costly and time-consuming. Additionally, TL provides an effective weight-initialization strategy for deep neural networks. This paper introduces the idea of adaptive TL in deep neural networks (ATL-DNN) for wind power prediction. Specifically, we show for wind power prediction that the deep neural network system can be adaptively modified when training on a different wind farm. The proposed ATL-DNN technique is tested for short-term wind power prediction, where continuously arriving information has to be exploited. Adaptive TL not only provides good weight initialization but also utilizes the incoming data for effective learning. Additionally, the proposed ATL-DNN technique is shown to transfer knowledge between different task domains (wind power to wind speed prediction) and from one region to another. The simulation results show that the proposed ATL-DNN technique achieves average values of 0.0637, 0.0986, and 0.0984 for the mean absolute error, root mean squared error, and standard deviation error, respectively.

  • Research Article
  • Cited by 247
  • 10.1016/j.neucom.2020.07.053
Survey on Deep Neural Networks in Speech and Vision Systems
  • Jul 26, 2020
  • Neurocomputing
  • M Alam + 4 more

