Abstract

Catastrophic forgetting is the well-known tendency of a deep neural network trained continually to lose previously learned knowledge when it is optimized for sequentially arriving tasks. Several continual learning methods have been proposed to address the issue, but they often fail to preserve previously learned knowledge while training on a new task. Moreover, these methods are susceptible to negative interference between tasks, which can itself lead to catastrophic forgetting and becomes increasingly severe when there is a notable gap between the domains of the tasks. This paper proposes a novel method that controls gates to select a subset of the parameters learned for old tasks, which are then used to optimize a new task while efficiently avoiding negative interference. The proposed approach executes only the old parameters that provide positive responses, determined by evaluating the effect of using the old and new parameters together. The decision to execute or skip old parameters through the gates is based on several responses across the network. We evaluate the proposed method in different continual learning scenarios on image classification datasets. By applying the proposed gating mechanism, which selectively involves the old parameters that provide positive prior knowledge to newer tasks, the proposed method outperforms other competitive methods and requires fewer parameters than state-of-the-art methods during inference. Additionally, we further demonstrate the effectiveness of the proposed method through various analyses.
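
To make the gating idea concrete, the following is a minimal sketch, assuming a PyTorch-style implementation, of a layer that keeps frozen parameter sets from old tasks and learns one gate per old branch to execute or skip it while optimizing a new task. The class and method names (GatedContinualLayer, start_new_task) and the soft sigmoid gates are illustrative assumptions, not the authors' actual implementation.

```python
import torch
import torch.nn as nn

class GatedContinualLayer(nn.Module):
    """Illustrative sketch (not the paper's implementation): old task-specific
    branches are frozen, and a learnable gate per old branch decides whether
    its output is executed or skipped for the current task."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.old_branches = nn.ModuleList()                      # frozen parameters of past tasks
        self.new_branch = nn.Linear(in_features, out_features)   # parameters of the current task
        self.gate_logits = nn.Parameter(torch.zeros(0))          # one gate logit per old branch

    def start_new_task(self):
        # Freeze the branch just trained and open a fresh branch for the new task.
        for p in self.new_branch.parameters():
            p.requires_grad = False
        self.old_branches.append(self.new_branch)
        self.new_branch = nn.Linear(self.in_features, self.out_features)
        self.gate_logits = nn.Parameter(torch.zeros(len(self.old_branches)))

    def forward(self, x):
        out = self.new_branch(x)
        if len(self.old_branches) > 0:
            gates = torch.sigmoid(self.gate_logits)              # soft execute/skip decisions
            for gate, branch in zip(gates, self.old_branches):
                # Old knowledge contributes only when its gate responds positively.
                out = out + gate * branch(x)
        return out
```

In this sketch, only the new branch and the gate logits receive gradients while a new task is trained, so old parameters are reused when their gates respond positively and skipped otherwise, which mirrors the stated goal of involving only the old parameters that provide positive prior knowledge.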

Highlights

  • Deep neural networks generally access the complete data of tasks when learning multiple tasks [1], [2]

  • The experiment based on ResNet-20 (WRN-28-2) with 20 tasks constructed in a random order is denoted N-R-20 (W-R-20)

  • In this work, we have addressed the catastrophic forgetting issue in continual learning that prevents the efficient optimization of a deep neural network for sequential tasks

Summary

INTRODUCTION

Deep neural networks generally access the complete data of all tasks when learning multiple tasks [1], [2]. An expansion-based strategy [9]–[11] generally introduces new learnable parameters when a new task is observed [9] or when the network fails to meet a predetermined criterion on loss or validation accuracy [16]. Unlike other kinds of strategies, the structural allocation approach assigns a disjoint set of parameters to each task and prevents previous parameters from being rewritten [12]. In other words, such methods do not update the previous parameter sets and therefore do not forget the knowledge of the previous tasks.
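
As a rough illustration of the structural allocation idea mentioned above, the sketch below assigns each task a disjoint slice of a weight matrix and zeroes gradients outside the current task's slice so that earlier parameters are never overwritten. It assumes PyTorch; the class AllocatedLinear and its equal split of output units per task are hypothetical choices for illustration, not the scheme of [12].

```python
import torch
import torch.nn as nn

class AllocatedLinear(nn.Module):
    """Sketch of structural allocation: each task owns a disjoint slice of the
    weight matrix, and gradients on slices owned by earlier tasks are zeroed,
    so previously learned parameters are never updated. Illustrative only."""

    def __init__(self, in_features, out_features, num_tasks):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.kaiming_uniform_(self.weight)
        # Hypothetical allocation: an equal split of output units per task.
        chunk = out_features // num_tasks
        self.masks = [torch.zeros(out_features, in_features) for _ in range(num_tasks)]
        for t in range(num_tasks):
            self.masks[t][t * chunk:(t + 1) * chunk] = 1.0
        self.current_task = 0

    def next_task(self):
        self.current_task += 1

    def forward(self, x):
        # Units allocated to the current and all earlier tasks are active.
        active = torch.stack(self.masks[: self.current_task + 1]).sum(dim=0)
        return x @ (self.weight * active).t()

    def protect_old_tasks(self):
        # Call after loss.backward(): keep gradients only on the current task's slice.
        if self.weight.grad is not None:
            self.weight.grad *= self.masks[self.current_task]
```

Calling protect_old_tasks() before the optimizer step confines updates to the current task's slice, which is why such methods do not forget previous tasks, at the cost of reserving a fixed share of capacity for every task.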

RELATED WORK
GATED NETWORK
FRAMEWORK
LOW-LEVEL RESPONSE
HIGH-LEVEL RESPONSE
EXPERIMENTS
IMAGENET-50 RESULTS
ANALYSIS
Findings
CONCLUSION