CLeaR: An adaptive continual learning framework for regression tasks

Abstract

Catastrophic forgetting means that a trained neural network model gradually forgets previously learned tasks when it is retrained on new ones. Overcoming this forgetting is a major challenge in machine learning. Numerous continual learning algorithms have been very successful at incremental learning of classification tasks, where new samples and their labels appear frequently. However, to the best of our knowledge, no existing research addresses catastrophic forgetting in regression tasks, even though it has emerged as one of the primary constraints in applications such as renewable energy forecasting. This article clarifies problem-related definitions and proposes a new methodological framework that can forecast targets and update itself by means of continual learning. The framework consists of forecasting neural networks and buffers that store newly collected data from a non-stationary data stream. Changes in the data stream's probability distribution, once identified by the framework, are learned sequentially. The framework is called CLeaR (Continual Learning for Regression Tasks), and its components can be flexibly customized for a specific application scenario. We design two sets of experiments to evaluate the CLeaR framework with respect to fitting error (training), prediction error (test), and forgetting ratio. The first is based on an artificial time series and explores how hyperparameters affect the framework. The second uses data collected from European wind farms to evaluate the framework's performance in a real-world application. The experimental results demonstrate that CLeaR can continually acquire knowledge from the data stream and improve prediction accuracy. The article concludes with further research issues arising from requirements to extend the framework.
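
A minimal Python sketch of the loop this abstract describes (forecast, buffer newly collected samples, and update once the data suggest a changed distribution) is given below. The error-based novelty test, the buffer capacity, and the plain refit are illustrative assumptions standing in for the components the paper leaves configurable:

```python
import numpy as np

class CLeaRSketch:
    """Illustrative CLeaR-style loop: forecast, buffer suspicious samples,
    and retrain once enough evidence of a changed distribution accumulates.
    The threshold test and refit are assumptions, not the paper's method."""

    def __init__(self, model, error_threshold=0.5, buffer_size=256):
        self.model = model                    # any regressor with fit/predict
        self.error_threshold = error_threshold
        self.buffer = []                      # samples flagged as "novel"
        self.buffer_size = buffer_size

    def step(self, x, y):
        """Forecast y from x, then decide whether (x, y) signals change."""
        y_hat = self.model.predict(x.reshape(1, -1))[0]
        if abs(y_hat - y) > self.error_threshold:
            self.buffer.append((x, y))        # evidence of distribution shift
        if len(self.buffer) >= self.buffer_size:
            X = np.stack([xi for xi, _ in self.buffer])
            Y = np.array([yi for _, yi in self.buffer])
            self.model.fit(X, Y)              # placeholder for a continual update
            self.buffer.clear()               # that should protect old knowledge
        return y_hat
```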

Similar Papers
  • Conference Article
  • Cited by 85
  • 10.1109/cvprw53098.2021.00399
Avalanche: an End-to-End Library for Continual Learning
  • Jun 1, 2021
  • Vincenzo Lomonaco + 27 more

Learning continually from non-stationary data streams is a long-standing goal and a challenging problem in machine learning. Recently, we have witnessed a renewed and fast-growing interest in continual learning, especially within the deep learning community. However, algorithmic solutions are often difficult to re-implement, evaluate and port across different settings, where even results on standard benchmarks are hard to reproduce. In this work, we propose Avalanche, an open-source end-to-end library for continual learning research based on PyTorch. Avalanche is designed to provide a shared and collaborative codebase for fast prototyping, training, and reproducible evaluation of continual learning algorithms.
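
Avalanche is a real, open-source PyTorch library, and a minimal training loop following the pattern in its documentation looks roughly like the sketch below. Exact module paths are version-dependent (e.g. `avalanche.training.supervised` in recent releases versus `avalanche.training.strategies` in older ones), so treat the imports as assumptions tied to a particular version:

```python
import torch
from avalanche.benchmarks.classic import SplitMNIST
from avalanche.models import SimpleMLP
from avalanche.training.supervised import Naive  # older releases: avalanche.training.strategies

# Split MNIST into 5 sequential "experiences" (tasks).
benchmark = SplitMNIST(n_experiences=5)
model = SimpleMLP(num_classes=10)

# Naive = plain fine-tuning on each experience, the standard lower baseline.
strategy = Naive(
    model,
    torch.optim.SGD(model.parameters(), lr=1e-3),
    torch.nn.CrossEntropyLoss(),
    train_mb_size=64,
    train_epochs=1,
)

for experience in benchmark.train_stream:        # learn experiences in order
    strategy.train(experience)
    strategy.eval(benchmark.test_stream)         # evaluate across all tasks
```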

  • Research Article
  • 10.1109/tpami.2025.3614868
Schedule-Robust Continual Learning.
  • Jan 1, 2025
  • IEEE transactions on pattern analysis and machine intelligence
  • Ruohan Wang + 3 more

Continual learning (CL) tackles a fundamental challenge in machine learning, aiming to continuously learn novel data from non-stationary data streams while mitigating forgetting of previously learned data. Although existing CL algorithms have introduced various practical techniques for combating forgetting, little attention has been devoted to studying how data schedules - which dictate how the sample distribution of a data stream evolves over time - affect the CL problem. Empirically, most CL methods are susceptible to schedule changes: they exhibit markedly lower accuracy when dealing with more "difficult" schedules over the same underlying training data. In practical scenarios, data schedules are often unknown, and a key challenge is thus to design CL methods that are robust to diverse schedules to ensure model reliability. In this work, we introduce the novel concept of schedule robustness for CL and propose Schedule-Robust Continual Learning (SCROLL), a strong baseline satisfying this desirable property. SCROLL trains a linear classifier on a suitably pre-trained representation, followed by model adaptation using replay data only. We connect SCROLL to a meta-learning formulation of CL with provable guarantees on schedule robustness. Empirically, the proposed method significantly outperforms existing CL methods, and we provide extensive ablations to highlight its properties.
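
SCROLL's two-stage recipe (fit a linear classifier on frozen pre-trained features, then adapt using replay data only) can be sketched as follows. The ridge-regression head, the replay-only fine-tuning loop, and all hyperparameters are illustrative assumptions, not the authors' exact procedure:

```python
import torch

def fit_linear_head(encoder, X, y, num_classes, ridge=1e-3):
    """Closed-form ridge head on frozen pre-trained features; a simple
    stand-in for SCROLL's first stage (details are assumptions)."""
    with torch.no_grad():
        Z = encoder(X)                              # (n, d) frozen features
    Y = torch.nn.functional.one_hot(y, num_classes).float()
    d = Z.shape[1]
    # W solves (Z^T Z + ridge * I) W = Z^T Y
    return torch.linalg.solve(Z.T @ Z + ridge * torch.eye(d), Z.T @ Y)

def adapt_with_replay(encoder, W, replay_X, replay_y, lr=1e-4, steps=100):
    """Second stage: refine the head using replay data alone, so updates
    depend on the stored samples rather than on arrival order."""
    W = W.clone().requires_grad_(True)
    opt = torch.optim.SGD([W], lr=lr)
    for _ in range(steps):
        with torch.no_grad():
            Z = encoder(replay_X)
        loss = torch.nn.functional.cross_entropy(Z @ W, replay_y)
        opt.zero_grad(); loss.backward(); opt.step()
    return W.detach()
```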

  • Conference Article
  • Cited by 125
  • 10.1109/iccv48922.2021.00814
Continual Prototype Evolution: Learning Online from Non-Stationary Data Streams
  • Oct 1, 2021
  • Matthias De Lange + 1 more

Attaining prototypical features to represent class distributions is well established in representation learning. However, learning prototypes online from streaming data is a challenging endeavor, as they rapidly become outdated due to the ever-changing parameter space during learning. Additionally, continual learning does not assume the data stream to be stationary, typically resulting in catastrophic forgetting of previous knowledge. We introduce the first system addressing both problems, in which prototypes evolve continually in a shared latent space, enabling learning and prediction at any point in time. To facilitate learning, a novel objective function synchronizes the latent space with the continually evolving prototypes. In contrast to the major body of work in continual learning, data streams are processed online without task information and can be highly imbalanced, for which we propose an efficient memory scheme. As an additional contribution, we propose the learner-evaluator framework, which i) generalizes existing paradigms in continual learning, ii) introduces data-incremental learning, and iii) models the bridge between continual learning and concept drift. We obtain state-of-the-art performance by a significant margin on eight benchmarks, including three highly imbalanced data streams. Code is publicly available.
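
A generic way to maintain evolving prototypes online is to keep per-class exponential moving averages of latent embeddings and predict by nearest prototype. The sketch below illustrates that idea only; the paper's synchronizing objective and memory scheme are not reproduced:

```python
import torch

class PrototypeTracker:
    """Per-class prototypes kept as exponential moving averages of latent
    embeddings, with nearest-prototype prediction. A generic sketch of
    online prototype evolution, not the paper's exact method."""

    def __init__(self, momentum=0.99):
        self.momentum = momentum
        self.prototypes = {}                  # class id -> (d,) tensor

    def update(self, z, y):
        """z: (batch, d) embeddings, y: (batch,) integer labels."""
        for zi, yi in zip(z.detach(), y.tolist()):
            if yi not in self.prototypes:
                self.prototypes[yi] = zi.clone()
            else:
                p = self.prototypes[yi]
                self.prototypes[yi] = self.momentum * p + (1 - self.momentum) * zi

    def predict(self, z):
        """Assign each embedding to the nearest prototype (Euclidean)."""
        labels = sorted(self.prototypes)
        P = torch.stack([self.prototypes[c] for c in labels])   # (C, d)
        dists = torch.cdist(z, P)                               # (batch, C)
        return torch.tensor(labels)[dists.argmin(dim=1)]
```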

  • Conference Article
  • Cited by 3
  • 10.1145/3543507.3583262
Continual Few-shot Learning with Transformer Adaptation and Knowledge Regularization
  • Apr 30, 2023
  • Xin Wang + 5 more

Continual few-shot learning, a paradigm that simultaneously addresses continual learning and few-shot learning, has become a challenging problem in machine learning. A capable continual few-shot learning model is expected to distinguish all seen classes as new categories arrive, where each category includes only very few labeled samples. However, existing continual few-shot learning methods consider only the visual modality, where the distributions of new categories often overlap indistinguishably with old categories, resulting in severe catastrophic forgetting. To tackle this problem, in this paper we study continual few-shot learning with the assistance of semantic knowledge by taking both the visual modality and the semantic concepts of categories into account. We propose a Continual few-shot learning algorithm with Semantic knowledge Regularization (CoSR) that adapts to distribution changes of visual prototypes through a Transformer-based prototype adaptation mechanism. Specifically, the original visual prototypes from the backbone are fed into a well-designed Transformer together with the corresponding semantic concepts, where the semantic concepts are extracted from all categories. The semantic-level regularization forces categories with similar semantics to be closely distributed, while dissimilar ones are constrained to be far away from each other. This regularization improves the model's ability to distinguish between new and old categories, thus significantly mitigating catastrophic forgetting in continual few-shot learning. Extensive experiments on CIFAR100, miniImageNet, CUB200, and an industrial dataset with a long-tail distribution demonstrate the advantages of our CoSR model compared with state-of-the-art methods.
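
The semantic-level regularization described above (pull prototypes of semantically similar categories together, push dissimilar ones apart) can be written as a simple pairwise loss. The cosine-similarity threshold and the hinge form below are assumptions for illustration, not the CoSR objective:

```python
import torch

def semantic_regularization(prototypes, semantic_emb, margin=1.0, sim_threshold=0.5):
    """Pairwise regularizer: prototypes of semantically similar categories
    are pulled together, dissimilar ones pushed beyond a margin.
    prototypes:   (C, d) visual prototypes
    semantic_emb: (C, k) semantic concept embeddings
    """
    s = torch.nn.functional.normalize(semantic_emb, dim=-1)
    sim = s @ s.T                                   # (C, C) cosine similarities
    dist = torch.cdist(prototypes, prototypes)      # (C, C) prototype distances
    similar = (sim > sim_threshold).float()
    eye = torch.eye(len(prototypes), device=prototypes.device)
    attract = (similar - eye).clamp(min=0) * dist           # pull similar pairs
    repel = (1 - similar) * torch.relu(margin - dist)       # push dissimilar pairs
    return (attract + repel).mean()
```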

  • Research Article
  • Cited by 6
  • 10.1016/j.neucom.2021.05.026
Long short-term memory self-adapting online random forests for evolving data stream regression
  • May 18, 2021
  • Neurocomputing
  • Yuan Zhong + 4 more

  • Conference Article
  • Cited by 8
  • 10.1109/ijcnn55064.2022.9892774
Targeted Data Poisoning Attacks Against Continual Learning Neural Networks
  • Jul 18, 2022
  • Huayu Li + 1 more

Continual (incremental) learning approaches are designed to address catastrophic forgetting in neural networks by training on batches or streaming data over time. In many real-world scenarios, the environments that generate streaming data are exposed to untrusted sources, which may provide data poisoned by an adversary. Adversaries can manipulate and inject malicious samples into the training data. Untrusted data sources and malicious samples thus expose vulnerabilities of neural networks that can lead to serious consequences in applications requiring reliable performance. However, recent works on continual learning have focused only on adversary-agnostic scenarios, without considering the possibility of data poisoning attacks. Further, recent work has demonstrated that continual learning approaches are vulnerable to backdoor attacks under a relaxed constraint on manipulating data. In this paper, we focus on a more general and practical poisoning setting that artificially forces catastrophic forgetting through clean-label data poisoning attacks. We propose a task-targeted data poisoning attack that forces the neural network to forget previously learned knowledge while the attack samples remain stealthy. The approach is benchmarked against three state-of-the-art continual learning algorithms in both domain- and task-incremental learning scenarios. The experiments demonstrate that the accuracy on targeted tasks drops significantly when the poisoned dataset is used in continual task learning.

  • Research Article
  • Cited by 14
  • 10.1097/tp.0000000000003316
A Primer on Machine Learning.
  • Aug 18, 2020
  • Transplantation
  • Audrene S Edwards + 2 more

  • Research Article
  • Cited by 3
  • 10.1016/j.datak.2019.101718
Online density estimation over high-dimensional stationary and non-stationary data streams
  • Jul 22, 2019
  • Data &amp; Knowledge Engineering
  • Aref Majdara + 1 more

  • Book Chapter
  • Cited by 6
  • 10.1007/978-3-030-43887-6_30
Data Preprocessing and Dynamic Ensemble Selection for Imbalanced Data Stream Classification
  • Jan 1, 2020
  • Paweł Zyblewski + 2 more

Learning from non-stationary imbalanced data streams is a serious challenge to the machine learning community. A significant number of works address the issue of classifying non-stationary data streams, but most do not take into consideration that real-life data streams may exhibit a high and changing class imbalance ratio, which can complicate the classification task. This work attempts to connect two important, yet rarely combined, research trends in data analysis, i.e., non-stationary data stream classification and imbalanced data classification. We propose a novel framework for training base classifiers and preparing the dynamic selection dataset (DSEL) that integrates data preprocessing and dynamic ensemble selection (DES) methods for imbalanced data stream classification. The proposed approach has been evaluated through computer experiments on 72 artificially generated data streams with various imbalance ratios, levels of label noise, and types of concept drift. In addition, we consider six variants of preprocessing methods and four DES methods. Experimental results showed that dynamic ensemble selection, even without any data preprocessing, can outperform a naive combination of the whole pool generated with the use of preprocessing methods. Combining DES with preprocessing further improves the results.
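
The general pipeline (rebalance each data chunk, train a pool of base classifiers, and fit a dynamic ensemble selection method on a held-out DSEL set) can be sketched with the real `imbalanced-learn` and `DESlib` packages. The single-chunk simplification and the specific choices (SMOTE, bagged trees, KNORA-E) are assumptions, not the paper's exact configuration:

```python
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import SMOTE          # preprocessing step
from deslib.des import KNORAE                     # dynamic ensemble selection

def fit_des_on_chunk(X_chunk, y_chunk):
    """One data-stream chunk: oversample, train a classifier pool, and fit
    a DES method on a held-out DSEL set. A simplified sketch; the paper's
    drift handling and DSEL preparation are not reproduced."""
    X_bal, y_bal = SMOTE().fit_resample(X_chunk, y_chunk)
    X_train, X_dsel, y_train, y_dsel = train_test_split(
        X_bal, y_bal, test_size=0.33, stratify=y_bal)
    pool = BaggingClassifier(DecisionTreeClassifier(), n_estimators=10)
    pool.fit(X_train, y_train)
    des = KNORAE(pool)                            # selects competent members per query
    des.fit(X_dsel, y_dsel)
    return des
```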

  • Discussion
  • Cited by 64
  • 10.1016/s2589-7500(21)00076-5
Continual learning in medical devices: FDA's action plan and beyond
  • Apr 28, 2021
  • The Lancet Digital Health
  • Kerstin N Vokinger + 2 more

  • Research Article
  • Cited by 11
  • 10.1016/j.ins.2023.119411
Dynamically evolving deep neural networks with continuous online learning
  • Jul 24, 2023
  • Information Sciences
  • Yuan Zhong + 3 more

  • Research Article
  • Cited by 1
  • 10.3390/sym17020182
Multi-Label Learning with Distribution Matching Ensemble: An Adaptive and Just-In-Time Weighted Ensemble Learning Algorithm for Classifying a Nonstationary Online Multi-Label Data Stream
  • Jan 24, 2025
  • Symmetry
  • Chao Shen + 6 more

Learning from a nonstationary data stream is challenging, as a data stream is generally considered endless and the learning model must be constantly amended to adapt to shifting data distributions. When multi-label data are involved, the challenge is further intensified. In this study, an adaptive online weighted multi-label ensemble learning algorithm called MLDME (multi-label learning with distribution matching ensemble) is proposed. It calculates both the feature matching level and the label matching level between any reserved data block and the newly received data block, and then assigns adaptive decision weights to the ensemble classifiers based on their distribution similarities. Specifically, MLDME abandons the commonly used but not entirely correct hypothesis that in a data stream, each data block always has the most similar distribution to the block emerging after it; thus, MLDME can provide a just-in-time decision for the newly received data block. In addition, to avoid an infinite extension of the ensemble, we use a fixed-size buffer to store the classifiers and design three different dynamic classifier updating rules. Experimental results on nine synthetic and three real-world multi-label nonstationary data streams indicate that the proposed MLDME algorithm is superior to several popular and state-of-the-art online learning paradigms and algorithms, including two specifically designed for classifying nonstationary multi-label data streams.
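
The core weighting idea (score each reserved block's distribution match against the newly received block and weight the corresponding ensemble members accordingly) can be illustrated as below. The mean/std summary distance and the softmax weighting are crude stand-ins; MLDME also matches label distributions and uses its own matching levels:

```python
import numpy as np

def block_similarity(X_old, X_new):
    """Crude feature-distribution match between two data blocks: compare
    per-feature means and stds. A stand-in for MLDME's matching levels."""
    d = np.abs(X_old.mean(0) - X_new.mean(0)).sum() \
        + np.abs(X_old.std(0) - X_new.std(0)).sum()
    return np.exp(-d)                     # higher = more similar

def ensemble_weights(reserved_blocks, X_new):
    """Softmax weights over ensemble members based on how closely each
    member's training block matches the newly received block."""
    sims = np.array([block_similarity(Xb, X_new) for Xb in reserved_blocks])
    e = np.exp(sims - sims.max())
    return e / e.sum()
```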

  • Research Article
  • Cite Count Icon 4
  • 10.11591/ijece.v6i4.pp1811-1817
Incremental Learning on Non-stationary Data Stream using Ensemble Approach
  • Aug 1, 2016
  • Meenakshi Anurag Thalor + 1 more

Incremental learning on a non-stationary distribution has been shown to be a very challenging problem in machine learning and data mining, because the joint probability distribution between the data and classes changes over time. Many real-time problems suffer concept drift as they change with time. For example, in an advertisement recommendation system, customers' behavior may change depending on the season of the year, on inflation, and on new products made available. An extra challenge arises when the classes to be learned are not represented equally in the training data, i.e., the classes are imbalanced, as most machine learning algorithms work well only when the training data is balanced. The objective of this paper is to develop an ensemble-based classification algorithm for non-stationary data streams (ENSDS) with a focus on two-class problems. In addition, we present an exhaustive comparison of the proposed algorithm with state-of-the-art classification approaches using different evaluation measures such as recall, f-measure, and g-mean.
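
For reference, the g-mean metric mentioned above is the geometric mean of per-class recalls (in the binary case, the square root of sensitivity times specificity). A minimal implementation:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls, a standard metric for
    imbalanced classification."""
    cm = confusion_matrix(y_true, y_pred)
    recalls = np.diag(cm) / cm.sum(axis=1)      # recall of each true class
    return float(np.prod(recalls) ** (1 / len(recalls)))
```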

  • Conference Article
  • Cited by 1
  • 10.1109/etfa52439.2022.9921432
Development of a Framework for Continual Learning in Industrial Robotics
  • Sep 6, 2022
  • Minh Trinh + 5 more

Continual learning (CL) is a machine learning (ML) paradigm for learning continually from non-stationary data streams while simultaneously transferring and protecting past knowledge. CL thereby avoids catastrophic forgetting, a common problem that arises when training ML models on new data. This paper presents a CL framework for data-driven learning of the dynamics model of a 6-degree-of-freedom serial industrial robot. This model can be used in model-based control algorithms without the need for extensive identification of robot-specific parameters such as mass and inertia, and can additionally capture complex effects such as friction. Furthermore, using CL, it can adapt to changes in the robot, e.g., due to wear or new tasks. With the help of CL, the ML-based dynamics model is continually fed new data and improves over the operating period of the robot.
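
A minimal sketch of such a continually updated dynamics model is shown below: a small network mapping joint positions, velocities, and accelerations to torques, updated on new data mixed with a bounded replay memory to protect past knowledge. The architecture, replay scheme, and dimensions are assumptions, not the authors' design:

```python
import torch

class DynamicsModelCL:
    """Sketch of continually learning a robot dynamics model
    (joint state -> torque) with a small FIFO replay memory."""

    def __init__(self, state_dim=18, torque_dim=6, memory_size=2048):
        # state: q, dq, ddq for a 6-DOF arm (6 * 3 = 18 inputs)
        self.net = torch.nn.Sequential(
            torch.nn.Linear(state_dim, 128), torch.nn.ReLU(),
            torch.nn.Linear(128, torque_dim))
        self.opt = torch.optim.Adam(self.net.parameters(), lr=1e-3)
        self.memory = []                          # (state, torque) pairs
        self.memory_size = memory_size

    def update(self, states, torques):
        """Train on the new batch mixed with replayed old samples."""
        batch_s, batch_t = states, torques
        if self.memory:
            old_s, old_t = zip(*self.memory[-len(states):])
            batch_s = torch.cat([states, torch.stack(old_s)])
            batch_t = torch.cat([torques, torch.stack(old_t)])
        loss = torch.nn.functional.mse_loss(self.net(batch_s), batch_t)
        self.opt.zero_grad(); loss.backward(); self.opt.step()
        # FIFO memory: store only the genuinely new samples, bounded in size.
        self.memory.extend(zip(states.detach(), torques.detach()))
        self.memory = self.memory[-self.memory_size:]
        return loss.item()
```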
