Adversarially Robust Continual Learning with Anti-Forgetting Loss

Abstract

Existing continual learning methods focus on preventing catastrophic forgetting but often overlook the challenge of adversarial examples in image classification. In this study, we propose a novel method that balances accuracy, robustness against adversarial examples, and the prevention of forgetting. Specifically, we first theoretically and experimentally demonstrate that learning through knowledge distillation, a common strategy in continual learning, conflicts with learning through the cross-entropy loss. To resolve this conflict, we propose a novel loss function that combines an additional memory data loss with a conflict-avoiding knowledge distillation loss, effectively preventing catastrophic forgetting while ensuring robustness. Experimental results show that the proposed method outperforms existing methods by 5.17% in clean accuracy and 2.10% in robust accuracy. This method proves to be especially beneficial in scenarios where the reuse of samples from previous tasks is limited.
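As a rough illustration of how such an objective might be assembled, the sketch below combines a cross-entropy loss on adversarial examples of the current task with a distillation loss and an extra cross-entropy loss on memory data from previous tasks. The loss weights, the single-step FGSM attack, and the plain KL-based distillation are illustrative assumptions; they do not reproduce the paper's exact conflict-avoiding formulation.

```python
# Minimal sketch of a combined continual + adversarial objective.
# Assumptions: inputs in [0, 1], a single-step FGSM attack, and hand-picked
# loss weights; the paper's conflict-avoiding distillation term is not shown.
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=8 / 255):
    """Single-step FGSM adversarial example (stand-in for a stronger attack)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    return (x_adv + eps * grad.sign()).clamp(0, 1).detach()

def combined_loss(model, old_model, x, y, x_mem, y_mem,
                  alpha=1.0, beta=1.0, temperature=2.0):
    """CE on adversarial current-task data + distillation and CE on memory data."""
    x_adv = fgsm_attack(model, x, y)
    ce_adv = F.cross_entropy(model(x_adv), y)              # robustness term
    with torch.no_grad():
        old_logits = old_model(x_mem)                      # teacher from previous tasks
    kd = F.kl_div(F.log_softmax(model(x_mem) / temperature, dim=1),
                  F.softmax(old_logits / temperature, dim=1),
                  reduction="batchmean") * temperature ** 2
    ce_mem = F.cross_entropy(model(x_mem), y_mem)          # additional memory-data loss
    return ce_adv + alpha * kd + beta * ce_mem
```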

Similar Papers
  • Research Article
  • Cite Count: 2
  • 10.1109/tpami.2024.3460871
Continual Learning From a Stream of APIs.
  • Dec 1, 2024
  • IEEE transactions on pattern analysis and machine intelligence
  • Enneng Yang + 7 more

Continual learning (CL) aims to learn new tasks without forgetting previous tasks. However, existing CL methods require a large amount of raw data, which is often unavailable due to copyright considerations and privacy risks. Instead, stakeholders usually release pre-trained machine learning models as a service (MLaaS), which users can access via APIs. This paper considers two practical-yet-novel CL settings: data-efficient CL (DECL-APIs) and data-free CL (DFCL-APIs), which achieve CL from a stream of APIs with partial or no raw data. Performing CL under these two new settings faces several challenges: unavailable full raw data, unknown model parameters, heterogeneous models of arbitrary architecture and scale, and catastrophic forgetting of previous APIs. To overcome these issues, we propose a novel data-free cooperative continual distillation learning framework that distills knowledge from a stream of APIs into a CL model by generating pseudo data, just by querying APIs. Specifically, our framework includes two cooperative generators and one CL model, forming their training as an adversarial game. We first use the CL model and the current API as fixed discriminators to train generators via a derivative-free method. Generators adversarially generate hard and diverse synthetic data to maximize the response gap between the CL model and the API. Next, we train the CL model by minimizing the gap between the responses of the CL model and the black-box API on synthetic data, to transfer the API's knowledge to the CL model. Furthermore, we propose a new regularization term based on network similarity to prevent catastrophic forgetting of previous APIs. Our method performs comparably to classic CL with full raw data on the MNIST and SVHN datasets in the DFCL-APIs setting. In the DECL-APIs setting, our method achieves 0.97×, 0.75× and 0.69× performance of classic CL on the more challenging CIFAR10, CIFAR100, and MiniImageNet, respectively.
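A heavily simplified sketch of the core loop is given below: a generator proposes synthetic inputs, a placeholder random-search step stands in for the paper's derivative-free generator update (keeping a weight perturbation only if it widens the gap between the CL model and the API), and the CL model is then updated to match the black-box API's responses on that synthetic data. The function names, the L1 gap measure, and the perturbation scale are assumptions; the cooperative two-generator design and the network-similarity regularizer are omitted.

```python
# Simplified sketch of distilling a black-box API into a CL model via a generator.
import torch
import torch.nn as nn

def response_gap(cl_model, api_fn, x):
    with torch.no_grad():
        return (cl_model(x).softmax(1) - api_fn(x)).abs().mean().item()

def distill_step(cl_model, api_fn, generator, optimizer, z_dim=64, batch=32):
    z = torch.randn(batch, z_dim)
    # Generator update (placeholder for the paper's derivative-free method):
    # keep a random weight perturbation only if it makes the synthetic data "harder",
    # i.e. only if it widens the gap between CL-model and API responses.
    backup = [p.detach().clone() for p in generator.parameters()]
    gap_before = response_gap(cl_model, api_fn, generator(z))
    with torch.no_grad():
        for p in generator.parameters():
            p.add_(0.01 * torch.randn_like(p))
    if response_gap(cl_model, api_fn, generator(z)) < gap_before:
        with torch.no_grad():
            for p, b in zip(generator.parameters(), backup):
                p.copy_(b)                       # revert: data did not get harder
    # CL-model update: match the black-box API's responses on synthetic data.
    with torch.no_grad():
        x = generator(z)
        target = api_fn(x)                       # API probabilities, no gradients
    loss = nn.functional.l1_loss(cl_model(x).softmax(1), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```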

  • Research Article
  • Cite Count: 9
  • 10.1162/neco_a_01615
Reducing Catastrophic Forgetting With Associative Learning: A Lesson From Fruit Flies.
  • Oct 10, 2023
  • Neural computation
  • Yang Shen + 2 more

Catastrophic forgetting remains an outstanding challenge in continual learning. Recently, methods inspired by the brain, such as continual representation learning and memory replay, have been used to combat catastrophic forgetting. Associative learning (retaining associations between inputs and outputs, even after good representations are learned) plays an important function in the brain; however, its role in continual learning has not been carefully studied. Here, we identified a two-layer neural circuit in the fruit fly olfactory system that performs continual associative learning between odors and their associated valences. In the first layer, inputs (odors) are encoded using sparse, high-dimensional representations, which reduces memory interference by activating nonoverlapping populations of neurons for different odors. In the second layer, only the synapses between odor-activated neurons and the odor's associated output neuron are modified during learning; the rest of the weights are frozen to prevent unrelated memories from being overwritten. We prove theoretically that these two perceptron-like layers help reduce catastrophic forgetting compared to the original perceptron algorithm, under continual learning. We then show empirically on benchmark data sets that this simple and lightweight architecture outperforms other popular neural-inspired algorithms when also using a two-layer feedforward architecture. Overall, fruit flies evolved an efficient continual associative learning algorithm, and circuit mechanisms from neuroscience can be translated to improve machine computation.
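A toy sketch of the two-layer idea, under assumed sizes and learning rate: a fixed sparse random projection followed by winner-take-all produces a sparse high-dimensional code, and only the synapses from the active code units to the observed label are updated, leaving all other weights frozen.

```python
# Toy sketch of the fly-inspired two-layer associative learner.
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_HID, N_CLASSES, K_ACTIVE = 50, 2000, 10, 40

proj = (rng.random((D_HID, D_IN)) < 0.1).astype(float)   # fixed sparse projection
out_w = np.zeros((N_CLASSES, D_HID))                      # plastic output synapses

def encode(x):
    h = proj @ x
    code = np.zeros(D_HID)
    code[np.argsort(h)[-K_ACTIVE:]] = 1.0                 # winner-take-all: top-k active
    return code

def learn(x, label, lr=0.1):
    code = encode(x)
    out_w[label] += lr * code                             # only active->label synapses move

def predict(x):
    return int(np.argmax(out_w @ encode(x)))

# continual stream: each sample is seen once, in order
for x, label in [(rng.random(D_IN), rng.integers(N_CLASSES)) for _ in range(100)]:
    learn(x, int(label))
```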

  • Research Article
  • Cite Count: 10
  • 10.1007/s10994-024-06524-z
From MNIST to ImageNet and back: benchmarking continual curriculum learning
  • Apr 22, 2024
  • Machine Learning
  • Kamil Faber + 5 more

Continual learning (CL) is one of the most promising trends in recent machine learning research. Its goal is to go beyond classical assumptions in machine learning and develop models and learning strategies that present high robustness in dynamic environments. This goal is realized by designing strategies that simultaneously foster the incorporation of new knowledge while avoiding forgetting past knowledge. The landscape of CL research is fragmented into several learning evaluation protocols, comprising different learning tasks, datasets, and evaluation metrics. Additionally, the benchmarks adopted so far are still distant from the complexity of real-world scenarios, and are usually tailored to highlight capabilities specific to certain strategies. In such a landscape, it is hard to clearly and objectively assess models and strategies. In this work, we fill this gap for CL on image data by introducing two novel CL benchmarks that involve multiple heterogeneous tasks from six image datasets, with varying levels of complexity and quality. Our aim is to fairly evaluate current state-of-the-art CL strategies on a common ground that is closer to complex real-world scenarios. We additionally structure our benchmarks so that tasks are presented in increasing and decreasing order of complexity—according to a curriculum—in order to evaluate if current CL models are able to exploit structure across tasks. We devote particular emphasis to providing the CL community with a rigorous and reproducible evaluation protocol for measuring the ability of a model to generalize and not to forget while learning. Furthermore, we provide an extensive experimental evaluation showing that popular CL strategies, when challenged with our proposed benchmarks, yield sub-par performance, high levels of forgetting, and present a limited ability to effectively leverage curriculum task ordering. We believe that these results highlight the need for rigorous comparisons in future CL works as well as pave the way to design new CL strategies that are able to deal with more complex scenarios.
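A minimal sketch of the curriculum idea, with placeholder dataset names and complexity scores rather than the paper's actual benchmarks: tasks are sorted by an assumed complexity score and presented in increasing or decreasing order.

```python
# Placeholder tasks and complexity scores; a CL strategy would train on each
# task in the chosen order.
tasks = {"MNIST": 1.0, "FashionMNIST": 2.0, "SVHN": 3.0,
         "CIFAR10": 4.0, "CIFAR100": 5.0, "TinyImageNet": 6.0}

increasing = sorted(tasks, key=tasks.get)   # easy-to-hard curriculum
decreasing = increasing[::-1]               # hard-to-easy curriculum
```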

  • Research Article
  • 10.69554/eiav2029
Learning management systems and online tools to support continuous workplace learning in academic libraries
  • Jun 1, 2023
  • Advances in Online Education: A Peer-Reviewed Journal
  • Jennifer Browning

In today’s evolving academic landscape, which has been made even more changeable by the COVID-19 global pandemic, library managers and administration must consider accessible and sustainable methods for providing continuous workplace learning programmes for library staff. Establishing sustainable continuous workplace learning and professional development initiatives is critical for library staff to remain confident in their service delivery and use of digital tools. It is possible for libraries to develop online continuous workplace learning programmes that employ an array of online tools that are already in use by the library, such as those used for course delivery, internal documentation and online communication. Specifically, as many libraries make use of learning management systems (LMS) to embed their information literacy programming for faculty and students, there is an opportunity to strategically use LMS to support professional development and continuous workplace learning for library staff. Drawing from examples from Carleton University Library, this paper explores how the use of an LMS and other online tools for continuous workplace learning can provide library staff with equitable online access to develop essential technical and practical skills, while helping to build a workplace culture that prioritises learning and skill development. Employing these tools in continuous learning and training programmes can allow libraries to become ‘learningful’ workplaces where staff at all levels are supported and are confident in their work.

  • Research Article
  • 10.3390/electronics14173345
Continual Graph Learning with Knowledge-Augmented Replay: A Case for Ethereum Phishing Detection
  • Aug 22, 2025
  • Electronics
  • Zonggui Tian + 1 more

Humans have the ability to incrementally learn, accumulate, update, and apply knowledge from dynamic environments. This capability, known as continual learning or lifelong learning, is also a long-term goal in the development of artificial intelligence. However, neural network-based continual learning suffers from catastrophic forgetting: the acquisition of new knowledge typically disrupts previously learned knowledge, leading to partial forgetting and a decline in the model’s overall performance. Most current continual learning methods can only mitigate catastrophic forgetting and fail to incrementally improve the overall performance. In this work, we aim to incrementally improve performance within a sample-incremental context by utilizing inter-stage edges as a pathway for explicit knowledge transfer in continual graph learning. Building on this pathway, we propose a knowledge-augmented replay method by leveraging evolving subgraphs of important nodes. This method enhances the distinction between patterns associated with different node classes and consolidates previously learned knowledge. Experiments on phishing detection in Ethereum transaction networks validate the effectiveness of the proposed method, demonstrating effective knowledge retention and augmentation while overcoming catastrophic forgetting and incrementally improving performance. The results also reveal the relationship between average accuracy and average forgetting. Lastly, we identify the key factor in incremental performance improvement, laying a foundation for the convergence of continual graph learning.
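A rough sketch of the replay idea, under assumed choices (degree as the importance score, a fixed budget, 1-hop ego subgraphs): after each stage, the subgraphs around the most important nodes are kept and merged into the next stage's training graph, so that inter-stage edges remain available as a pathway for knowledge transfer.

```python
# Sketch of knowledge-augmented replay on a growing transaction graph.
import networkx as nx

def select_replay_subgraph(graph, budget=50, radius=1):
    # "important" nodes approximated by degree; the paper's importance measure differs
    important = sorted(graph.degree, key=lambda kv: kv[1], reverse=True)[:budget]
    replay = nx.Graph()
    for node, _ in important:
        replay = nx.compose(replay, nx.ego_graph(graph, node, radius=radius))
    return replay

def next_stage_training_graph(new_stage_graph, replay_subgraph):
    # edges between replayed nodes and new-stage nodes serve as the pathway
    # for explicit knowledge transfer across stages
    return nx.compose(new_stage_graph, replay_subgraph)
```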

  • Research Article
  • Cite Count: 8
  • 10.1007/s10462-024-10924-x
Learning to learn for few-shot continual active learning
  • Sep 5, 2024
  • Artificial Intelligence Review
  • Stella Ho + 3 more

Continual learning strives to ensure stability in solving previously seen tasks while demonstrating plasticity in a novel domain. Recent advances in continual learning are mostly confined to a supervised learning setting, especially in the NLP domain. In this work, we consider a few-shot continual active learning setting where labeled data are inadequate, and unlabeled data are abundant but with a limited annotation budget. We exploit meta-learning and propose a method, called Meta-Continual Active Learning. This method sequentially queries the most informative examples from a pool of unlabeled data for annotation to enhance task-specific performance and tackles continual learning problems through a meta-objective. Specifically, we employ meta-learning and experience replay to address inter-task confusion and catastrophic forgetting. We further incorporate textual augmentations to avoid memory over-fitting caused by experience replay and sample queries, thereby ensuring generalization. We conduct extensive experiments on benchmark text classification datasets from diverse domains to validate the feasibility and effectiveness of meta-continual active learning. We also analyze the impact of different active learning strategies on various meta-continual learning models. The experimental results demonstrate that introducing randomness into sample selection is the best default strategy for maintaining generalization in the meta-continual learning framework.
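A minimal sketch of the query-then-replay loop, assuming an entropy-based acquisition function and fixed budgets in place of the paper's meta-learned objective and text augmentations: the highest-entropy unlabeled examples are selected for annotation, and replayed examples from earlier tasks are mixed into training.

```python
# Sketch of active querying plus experience replay for continual learning.
import torch
import torch.nn.functional as F

def query_by_entropy(model, unlabeled_x, budget=16):
    """Pick the most uncertain unlabeled examples to send for annotation."""
    with torch.no_grad():
        probs = F.softmax(model(unlabeled_x), dim=1)
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
    return torch.topk(entropy, k=budget).indices

def replay_batch(memory_x, memory_y, batch=16):
    """Sample stored examples from earlier tasks to mix into the current update."""
    idx = torch.randperm(memory_x.size(0))[:batch]
    return memory_x[idx], memory_y[idx]
```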

  • Conference Article
  • Cite Count: 10
  • 10.1109/ijcnn55064.2022.9892774
Targeted Data Poisoning Attacks Against Continual Learning Neural Networks
  • Jul 18, 2022
  • Huayu Li + 1 more

Continual (incremental) learning approaches are designed to address catastrophic forgetting in neural networks by training on batches or streaming data over time. In many real-world scenarios, the environments that generate streaming data draw on untrusted sources, which can contain data poisoned by an adversary. Adversaries can manipulate and inject malicious samples into the training data, exposing vulnerabilities of neural networks that can lead to serious consequences in applications requiring reliable performance. However, recent work on continual learning has focused only on adversary-agnostic scenarios without considering the possibility of data poisoning attacks. Further, recent work has demonstrated that continual learning approaches are vulnerable to backdoor attacks under a relaxed constraint on data manipulation. In this paper, we focus on a more general and practical poisoning setting that artificially forces catastrophic forgetting through clean-label data poisoning attacks. We propose a task-targeted data poisoning attack that forces the neural network to forget previously learned knowledge while the attack samples remain stealthy. The approach is benchmarked against three state-of-the-art continual learning algorithms in both domain- and task-incremental learning scenarios. The experiments demonstrate that accuracy on the targeted tasks drops significantly when the poisoned dataset is used in continual task learning.
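As a hedged illustration of the attack setting, and not the paper's algorithm, the sketch below crafts clean-label poisons with a generic gradient-alignment heuristic: the poison keeps its correct label, but its perturbation is optimized so that a training step on the poison moves the model in a direction that raises the loss on exemplars of a targeted previous task. The perturbation bound, optimizer, and number of steps are assumptions.

```python
# Generic gradient-alignment sketch of clean-label poisoning that encourages
# forgetting of a targeted previous task (not the paper's exact method).
import torch
import torch.nn.functional as F

def craft_clean_label_poisons(model, x_poison, y_poison, x_target, y_target,
                              eps=8 / 255, steps=20, lr=0.01):
    params = [p for p in model.parameters() if p.requires_grad]
    # direction that would *increase* the loss on the targeted previous task
    target_grad = torch.autograd.grad(
        F.cross_entropy(model(x_target), y_target), params)
    delta = torch.zeros_like(x_poison, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        poison_grad = torch.autograd.grad(
            F.cross_entropy(model(x_poison + delta), y_poison),
            params, create_graph=True)
        # training on the poison follows -poison_grad, so pushing poison_grad
        # to point against target_grad makes that step raise the target loss
        align = torch.stack([F.cosine_similarity(pg.flatten(), tg.flatten(), dim=0)
                             for pg, tg in zip(poison_grad, target_grad)]).mean()
        opt.zero_grad()
        align.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)              # keep the perturbation small/stealthy
    return (x_poison + delta).detach(), y_poison  # labels stay correct (clean-label)
```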

  • Conference Article
  • Cite Count: 6
  • 10.1109/swc50871.2021.00046
Towards Online Continuous Reinforcement Learning on Industrial Internet of Things
  • Oct 1, 2021
  • Cheng Qian + 4 more

Training machine learning models, such as reinforcement learning models, requires a significant investment of time, and a trained model can only work on a specific system in a specific environment. When the application scenario of reinforcement learning changes, or the application environment changes, the reinforcement learning model needs to be retrained. Thus, it is critical to design techniques that can reduce the overhead of retraining reinforcement learning models, enabling them to adapt to constantly changing environments. In this paper, toward improving the performance of learning models in dynamic Industrial Internet of Things (IIoT), we propose an online continuous reinforcement learning strategy. In our process, when the retraining condition is triggered, our online continuous learning strategy will re-engage the training process and update the well-trained model. To evaluate the performance of our proposed approach, we categorize the entire application space for applying reinforcement learning to IIoT systems into four scenarios, namely, non-continuous learning without learning model sharing, non-continuous learning with learning model sharing, continuous learning without learning model sharing, and continuous learning with learning model sharing. For each scenario, we design a Q-learning-based reinforcement learning algorithm. Via extensive evaluation, our results show that the online continuous reinforcement learning approach that we propose can significantly reduce the overhead of retraining the learning model, enabling the learning algorithm to quickly adapt to a changing environment.
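The sketch below illustrates the "online continuous" idea for a tabular Q-learning agent: the trained policy keeps serving, and when a drift signal (here, an assumed moving-average reward drop) triggers retraining, learning resumes from the existing Q-table rather than from scratch. The thresholds and the drift test are illustrative assumptions.

```python
# Tabular Q-learning agent with a simple retraining trigger.
import numpy as np

class ContinuousQLearner:
    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.95, eps=0.1):
        self.q = np.zeros((n_states, n_actions))   # kept across retraining (continuous)
        self.alpha, self.gamma, self.eps = alpha, gamma, eps
        self.recent_rewards = []

    def act(self, s):
        if np.random.rand() < self.eps:
            return np.random.randint(self.q.shape[1])
        return int(np.argmax(self.q[s]))

    def update(self, s, a, r, s_next):
        target = r + self.gamma * self.q[s_next].max()
        self.q[s, a] += self.alpha * (target - self.q[s, a])

    def retraining_triggered(self, r, window=100, threshold=0.5):
        # drift heuristic: retrain when the moving-average reward drops too low
        self.recent_rewards.append(r)
        self.recent_rewards = self.recent_rewards[-window:]
        return (len(self.recent_rewards) == window
                and np.mean(self.recent_rewards) < threshold)
```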

  • Book Chapter
  • Cite Count: 5
  • 10.1007/978-3-030-88010-1_50
Continual Representation Learning via Auto-Weighted Latent Embeddings on Person ReID
  • Jan 1, 2021
  • Tianjun Huang + 2 more

Popular deep neural network models in artificial intelligence systems are found to suffer from the catastrophic forgetting problem: when learning on a sequence of tasks, deep networks tend to only achieve high performance on the current task, while losing performance on previously learned tasks. This issue is often addressed by continual learning or lifelong learning. The majority of existing continual learning approaches adopt a class-incremental strategy, which continuously expands the network structure. Representation learning, which only leverages the feature vector before the classification layer, is able to maintain the model capacity in continual learning. However, recent continual representation learning methods are not well evaluated on unseen classes. In this paper, we focus on the performance of continual representation learning on unseen classes and propose a novel auto-weighted latent embeddings method. For each task, autoencoders are developed to reconstruct feature maps from different levels in the neural network. The embeddings generated by these autoencoders on the manifolds are constrained when learning a new task so as to preserve the knowledge of previous tasks. An adapted auto-weighted approach is developed in this paper to assign different levels of importance to the embeddings based on reconstruction errors. Our experiments on three widely used Person Re-identification datasets expose the existence of the catastrophic forgetting problem for representation learning on unseen classes, and demonstrate that our proposed method outperforms other related methods in the continual representation learning setup.
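A condensed sketch of the constraint, assuming autoencoders with encode/decode methods and a softmax-based weighting that stands in for the paper's exact auto-weighting scheme: per-level autoencoders embed intermediate feature maps, the new task's embeddings are pulled toward those recorded for the previous task, and levels with lower reconstruction error receive higher weight.

```python
# Sketch of an auto-weighted latent-embedding constraint across network levels.
import torch
import torch.nn.functional as F

def auto_weighted_embedding_loss(new_feats, old_feats, autoencoders):
    """new_feats / old_feats: lists of feature maps from several network levels;
    autoencoders: one per level, assumed to expose encode() and decode()."""
    penalties, recon_errors = [], []
    for f_new, f_old, ae in zip(new_feats, old_feats, autoencoders):
        z_new, z_old = ae.encode(f_new), ae.encode(f_old.detach())
        penalties.append(F.mse_loss(z_new, z_old.detach()))           # stay near old embedding
        recon_errors.append(F.mse_loss(ae.decode(z_old), f_old).detach())
    # lower reconstruction error -> higher weight for that level's constraint
    weights = torch.softmax(-torch.stack(recon_errors), dim=0)
    return (weights * torch.stack(penalties)).sum()
```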

  • Research Article
  • 10.12731/2077-1770-2024-16-4-457
CONTINUOUS LIFELONG LEARNING AS A PRACTICE OF «SELF-CARE»
  • Dec 30, 2024
  • Sovremennye issledovaniya sotsialnykh problem
  • Natalia N Balashova + 2 more

Background. The relevance of this topic stems from the existential situation created by the active development of the information society and its transition to a knowledge society, in which the values associated with self-education and self-development of the individual prevail. The strategy of continuous learning «throughout life» is therefore in demand, as it allows individuals to satisfy their needs for self-development and self-improvement. The subject of this study is the concept of continuous lifelong learning, which acts as a practice of «self-care». Purpose. The main goal of the article is to explicate the axiological potential of continuous lifelong learning and to show that such learning acts primarily as «self-care», allowing the individual to develop and improve. Materials and methods. Conceptual modeling, interpretation, and contextual analysis were central to developing and addressing the research problems. The main provisions of the article contribute to a general conceptual idea of the potential of continuous lifelong learning, both for the self-development and improvement of the individual and for understanding the individual's existence in the knowledge society. Results. The article raises the question: how, and by what means, can the strategy of continuous lifelong learning be implemented? It is argued that the demand of large numbers of people for continuous lifelong learning can be satisfied only with the help of e-learning and distance learning technologies, and that online courses, in particular massive open online courses (MOOCs), are the most popular and promising of these. Thanks to online courses, each person is potentially capable of creating a space for (self-)education, and for individuals who are ready for continuous learning, MOOCs appear to be a phenomenon that contributes to the implementation of the practice of «self-care».

  • Research Article
  • Cite Count: 28
  • 10.1109/tcds.2022.3231055
Neural Manifold Modulated Continual Reinforcement Learning for Musculoskeletal Robots
  • Feb 1, 2024
  • IEEE Transactions on Cognitive and Developmental Systems
  • Jiahao Chen + 3 more

Continual learning and development are significant for robots that must learn multiple tasks sequentially. The difficulty lies in balancing the efficient learning of new tasks and overcoming catastrophic forgetting of old tasks. Although many continual learning methods have been proposed for pattern recognition, continual reinforcement learning methods for redundant musculoskeletal and robotic systems are few and have limitations. Therefore, inspired by developmental mechanisms in the motor cortex, this article proposes a neural manifold modulated continual reinforcement learning method for musculoskeletal and robotic systems. First, a recurrent neural network (RNN) with an expected neural manifold is designed and conditions on its weights are derived. Second, the ability of projectors to characterize the neural manifold within the RNN is analyzed. Furthermore, a continual reinforcement learning method for the RNN is proposed, modulated by the neural manifold. The method is validated on redundant musculoskeletal and robotic systems in simulation. The results suggest that it can realize continual reinforcement learning of multiple tasks across different movements and environments. Furthermore, compared with related works, the proposed method achieves better performance.

  • Research Article
  • Cite Count: 1
  • 10.1108/ir-03-2025-0104
Continual learning using GPT-4o and natural language processing: a robotic implementation for printed circuit board recycling
  • Aug 20, 2025
  • Industrial Robot: the international journal of robotics research and application
  • Genci Capi + 1 more

Purpose This study aims to develop a self-improving system that continuously improves performance with minimal human intervention. The system combines a CNN model, GPT-4o, and user-assisted dataset refinement via natural language to classify electronic components in a robotic printed circuit board (PCB) recycling scenario. Design/methodology/approach A CNN-based object detection model serves as the system’s core vision tool. When recognition confidence is low, the system engages GPT-4o and the user for classification through natural language input. The collected data are used to update the training set and re-train the model, enabling continuous performance improvement. A robotic manipulator evaluates the developed algorithm in a real hardware implementation task. Findings Experimental results demonstrate that the proposed framework significantly improves classification accuracy over time. The integration of GPT-4o for interactive data refinement reduces manual labeling efforts while strengthening the system’s ability to identify and sort PCB recycling components accurately. Research limitations/implications Several challenges remain concerning the continual learning model and GPT-4o’s image recognition capabilities, which will be addressed to improve the system. Evaluations using Grad-CAM showed that even when an object’s features could not be captured at first, the object became recognizable through continual learning. Therefore, continual learning is expected to improve accuracy for objects with initially poor recognition accuracy. Practical implications Continuous incremental learning using speech recognition and GPT-4o was proposed in this paper. The incremental learning improved the recognition rates. The proposed algorithm is implemented to classify objects in images for the recycling of circuit boards. The implementation demonstrated the practical value of continuous learning by enabling the robot to handle dynamic and diverse sorting tasks with greater efficiency and precision, reducing errors and improving overall operational performance. The proposed continual learning can be implemented in a wide range of applications. Originality/value The concept of continuous learning using GPT-4o and natural language for dataset updates is a novel contribution. In addition, the application of continual learning for robotic PCB manipulation is somewhat unique.
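A high-level sketch of the self-improving loop, with every function name below (detect, ask_gpt4o_or_user, retrain) a placeholder rather than an actual detector or OpenAI API: when the detector's confidence is low, an external labeler is consulted, the example is added to the dataset, and the model is periodically retrained.

```python
# Placeholder self-improving loop; detect / ask_gpt4o_or_user / retrain are
# hypothetical stand-ins, not real library calls.
CONF_THRESHOLD = 0.6
RETRAIN_EVERY = 50

def self_improving_loop(model, image_stream, dataset):
    new_since_retrain = 0
    for image in image_stream:
        label, confidence = model.detect(image)          # hypothetical detector API
        if confidence < CONF_THRESHOLD:
            label = ask_gpt4o_or_user(image)             # stub for GPT-4o / user dialogue
            dataset.append((image, label))               # refine the training set
            new_since_retrain += 1
        if new_since_retrain >= RETRAIN_EVERY:
            model = retrain(model, dataset)              # continual fine-tuning step
            new_since_retrain = 0
        yield label
```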

  • Conference Article
  • Cite Count: 4
  • 10.1109/icassp49357.2023.10095984
Is Multi-Task Learning an Upper Bound for Continual Learning?
  • Jun 4, 2023
  • Zihao Wu + 3 more

Continual learning and multi-task learning are commonly used machine learning techniques for learning from multiple tasks. However, existing literature assumes multi-task learning as a reasonable performance upper bound for various continual learning algorithms, without rigorous justification. Additionally, in a multi-task setting, a small subset of tasks may behave as adversarial tasks, negatively impacting overall learning performance. On the other hand, continual learning approaches can avoid the negative impact of adversarial tasks and maintain performance on the remaining tasks, resulting in better performance than multi-task learning. This paper introduces a novel continual self-supervised learning approach, where each task involves learning an invariant representation for a specific class of data augmentations. We demonstrate that this approach results in naturally contradicting tasks and that, in this setting, continual learning often outperforms multi-task learning on benchmark datasets, including MNIST, CIFAR-10, and CIFAR-100.
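A toy sketch of the construction, with simple tensor-level augmentations and a plain view-matching loss standing in for the paper's self-supervised objective: each task asks the encoder to be invariant to one augmentation family, and the tasks are presented one after another rather than jointly.

```python
# Toy continual self-supervised setup: one augmentation family per task.
import torch
import torch.nn as nn
import torch.nn.functional as F

augmentation_tasks = [
    lambda x: x + 0.1 * torch.randn_like(x),                 # task 1: noise invariance
    lambda x: torch.flip(x, dims=[-1]),                      # task 2: flip invariance
    lambda x: x * torch.empty_like(x).uniform_(0.8, 1.2),    # task 3: brightness/scale
]

encoder = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 128), nn.ReLU(), nn.Linear(128, 64))
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

def invariance_loss(x, augment):
    z1 = F.normalize(encoder(x), dim=1)
    z2 = F.normalize(encoder(augment(x)), dim=1)
    return (1 - (z1 * z2).sum(dim=1)).mean()                 # pull the two views together

for augment in augmentation_tasks:                           # sequential (continual) tasks
    for _ in range(10):
        x = torch.rand(16, 1, 32, 32)
        loss = invariance_loss(x, augment)
        opt.zero_grad(); loss.backward(); opt.step()
```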

  • Research Article
  • Cite Count: 11
  • 10.1007/s10994-022-06283-9
Hierarchically structured task-agnostic continual learning
  • Dec 28, 2022
  • Machine Learning
  • Heinke Hihn + 1 more

One notable weakness of current machine learning algorithms is the poor ability of models to solve new problems without forgetting previously acquired knowledge. The Continual Learning paradigm has emerged as a protocol to systematically investigate settings where the model sequentially observes samples generated by a series of tasks. In this work, we take a task-agnostic view of continual learning and develop a hierarchical information-theoretic optimality principle that facilitates a trade-off between learning and forgetting. We derive this principle from a Bayesian perspective and show its connections to previous approaches to continual learning. Based on this principle, we propose a neural network layer, called the Mixture-of-Variational-Experts layer, that alleviates forgetting by creating a set of information processing paths through the network which is governed by a gating policy. Equipped with a diverse and specialized set of parameters, each path can be regarded as a distinct sub-network that learns to solve tasks. To improve expert allocation, we introduce diversity objectives, which we evaluate in additional ablation studies. Importantly, our approach can operate in a task-agnostic way, i.e., it does not require task-specific knowledge, as is the case with many existing continual learning algorithms. Due to the general formulation based on generic utility functions, we can apply this optimality principle to a large variety of learning problems, including supervised learning, reinforcement learning, and generative modeling. We demonstrate the competitive performance of our method on continual reinforcement learning and variants of the MNIST, CIFAR-10, and CIFAR-100 datasets.
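A bare-bones sketch of an expert layer with a gating policy is shown below; it uses a soft mixture over a few small expert sub-networks and omits the variational and information-theoretic machinery described in the paper. Sizes and the number of experts are illustrative assumptions.

```python
# Minimal gated mixture-of-experts layer (soft routing).
import torch
import torch.nn as nn

class GatedExpertLayer(nn.Module):
    def __init__(self, d_in, d_out, n_experts=4):
        super().__init__()
        self.gate = nn.Linear(d_in, n_experts)                # gating policy
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_in, d_out), nn.ReLU()) for _ in range(n_experts))

    def forward(self, x):
        gate_probs = torch.softmax(self.gate(x), dim=-1)                 # (B, E)
        expert_outs = torch.stack([e(x) for e in self.experts], dim=1)   # (B, E, D)
        return (gate_probs.unsqueeze(-1) * expert_outs).sum(dim=1)       # soft mixture

layer = GatedExpertLayer(d_in=32, d_out=16)
y = layer(torch.randn(8, 32))
```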

  • Research Article
  • Cite Count: 6
  • 10.1117/1.jmi.9.3.034502
Impact of continuous learning on diagnostic breast MRI AI: evaluation on an independent clinical dataset.
  • Jun 6, 2022
  • Journal of Medical Imaging
  • Hui Li + 6 more

Purpose: We demonstrate continuous learning and assess its impact on the performance of artificial intelligence for breast dynamic contrast-enhanced magnetic resonance imaging in the task of distinguishing malignant from benign lesions on an independent clinical test dataset. Approach: The study included 1979 patients with 1990 lesions who underwent breast MR imaging during 2015, 2016, and 2017, retrospectively collected under an IRB-approved protocol; there were 1494 malignant and 496 benign lesions based on histopathology. AI classifiers were trained in the task of distinguishing malignant from benign lesions, and independent testing was performed to assess the effect of increasing the number of training cases. Five training sets mimicking clinical implementation of continuous AI learning included cases from (1) the first quarter of 2015, (2) the first half of 2015, (3) all of 2015, (4) all of 2015 and the first half of 2016, and (5) all of 2015 and 2016. All classifiers were evaluated on the 2017 independent test set. The area under the ROC curve (AUC) served as the performance metric and was calculated over all lesions in the test set, as well as over only mass lesions and only non-mass enhancements. The Mann-Kendall test was used to determine whether continuous learning resulted in a statistically significant positive trend in classification performance. Results: Over the continuous training period, the selected feature subsets tended to become more similar and stable. Performance of the five training conditions on the independent test dataset yielded AUCs of 0.86 (95% CI: [0.83, 0.90]), 0.87 (95% CI: [0.83, 0.90]), 0.88 (95% CI: [0.84, 0.91]), 0.89 (95% CI: [0.85, 0.92]), and 0.89 (95% CI: [0.86, 0.92]). The Mann-Kendall test indicated a statistically significant positive trend in classification performance with continuous learning. Conclusions: Improved diagnostic performance over time was observed when continuous learning of AI was implemented and evaluated on an independent clinical test dataset.
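The evaluation protocol can be mimicked with a small sketch: classifiers are trained on progressively larger cumulative sets, each is scored by AUC on the same held-out test set, and the AUC sequence is tested for a monotone trend. Kendall's tau against the time index is used here as a stand-in for the Mann-Kendall test, and the synthetic data and logistic-regression classifier are assumptions for illustration.

```python
# Cumulative-training evaluation with a trend test on the AUC sequence.
import numpy as np
from scipy.stats import kendalltau
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 20))
y = (X[:, 0] + 0.5 * rng.normal(size=2000) > 0).astype(int)
X_test, y_test = X[1500:], y[1500:]                   # fixed held-out test set

aucs = []
for n_train in [100, 250, 500, 1000, 1500]:           # cumulative "time periods"
    clf = LogisticRegression(max_iter=1000).fit(X[:n_train], y[:n_train])
    aucs.append(roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))

# Kendall's tau vs. the time index approximates a Mann-Kendall trend test
tau, p_value = kendalltau(np.arange(len(aucs)), aucs)
print(aucs, tau, p_value)
```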
