Abstract

This paper studies continual learning (CL) of a sequence of aspect sentiment classification (ASC) tasks. Although some CL techniques have been proposed for document sentiment classification, we are not aware of any CL work on ASC. A CL system that incrementally learns a sequence of ASC tasks should address the following two issues: (1) transfer knowledge learned from previous tasks to the new task to help it learn a better model, and (2) maintain the performance of the models for previous tasks so that they are not forgotten. This paper proposes a novel capsule network based model called B-CL to address these issues. B-CL markedly improves the ASC performance on both the new task and the old tasks via forward and backward knowledge transfer. The effectiveness of B-CL is demonstrated through extensive experiments.
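To make the setting concrete, here is a minimal runnable sketch of the task-incremental protocol the abstract describes: tasks arrive one at a time, each task's training data is discarded after the task is learned, and the model is re-evaluated on all tasks seen so far. Everything in it (the dummy data, the trivial threshold "model", the helper names) is an illustrative placeholder standing in for real ASC data and the BERT-based B-CL model, not the authors' code.

```python
import random

def make_task(task_id, n):
    """Dummy ASC-like data: (feature, sentiment label) pairs."""
    return [(random.gauss(task_id, 1.0), random.choice([0, 1])) for _ in range(n)]

def train_one_task(model, task_id, train_data):
    """Placeholder for fine-tuning: store a per-task decision threshold."""
    model[task_id] = sum(x for x, _ in train_data) / len(train_data)

def evaluate(model, task_id, test_data):
    """Placeholder accuracy: fraction of correct threshold predictions."""
    t = model[task_id]
    return sum((x > t) == (y == 1) for x, y in test_data) / len(test_data)

model, seen = {}, {}
for task_id in range(3):
    train_data, test_data = make_task(task_id, 100), make_task(task_id, 50)
    train_one_task(model, task_id, train_data)  # old tasks' data unavailable
    seen[task_id] = test_data
    # The task ID is given at test time (task-incremental, not
    # class-incremental), so forgetting can be measured per old task.
    for tid, data in seen.items():
        print(f"after task {task_id}: task {tid} acc = {evaluate(model, tid, data):.2f}")
```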

Highlights

  • Continual learning (CL) aims to incrementally learn a sequence of tasks

  • (1) It proposes the problem of task-incremental learning for aspect sentiment classification (ASC). (2) It proposes a new model, B-CL, with a novel adapter, the Continual Learning Adapter (CLA), incorporated into a pre-trained BERT to enable ASC continual learning (see the adapter sketch after this list)

  • The proposed B-CL deals with catastrophic forgetting (CF) and performs knowledge transfer to improve the performance of both the new task and the old tasks
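For readers unfamiliar with adapters, the following minimal PyTorch sketch shows the basic idea CLA builds on: a small bottleneck module added inside each layer of a (frozen) pre-trained BERT, so only a few parameters are trained while the pre-trained weights stay fixed. The class name, sizes, and placement are assumptions for illustration; the actual CLA additionally contains capsules and task masks for knowledge transfer and forgetting avoidance, which are not shown here.

```python
# Minimal sketch of a bottleneck adapter (the skeleton CLA builds on).
# Names and sizes are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)  # down-projection
        self.up = nn.Linear(bottleneck, hidden_size)    # up-projection
        self.act = nn.GELU()

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # Residual connection preserves the pre-trained representation;
        # only the small adapter weights need to be trained.
        return h + self.up(self.act(self.down(h)))

h = torch.randn(2, 10, 768)   # (batch, seq_len, hidden)
print(Adapter()(h).shape)     # torch.Size([2, 10, 768])
```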

Summary

Introduction

Continual learning (CL) aims to incrementally learn a sequence of tasks. Once a task is learned, its training data is often discarded (Chen and Liu, 2018). In the common class-incremental CL setting, a test case from any class may be presented to the model to classify without giving it any task information. This setting is not applicable to ASC: the task (e.g., the product domain) is known at test time, so ASC continual learning is task-incremental. The system needs to identify the shared knowledge that can be transferred to the new task to help it learn better, and the task-specific knowledge that needs to be protected to avoid forgetting of previous models. To this end, the paper proposes the new model B-CL, with a novel adapter CLA incorporated into a pre-trained BERT, to enable ASC continual learning.

Existing approaches handle forgetting differently. The methods in (Shin et al., 2017; Kamra et al., 2017; Rostami et al., 2019; He and Jaeger, 2018) build data generators for previous tasks so that, when learning the new task, some generated data from previous tasks can be replayed to help avoid forgetting. As these methods are mainly for avoiding CF, their final models after learning a sequence of tasks are typically worse than models learned for each task separately. A heuristic naive Bayes (NB) method was used in (Wang et al., 2019), and Xia et al. (2017) presented a lifelong learning (LL) approach based on voting of individual task models.
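The paper's task masks (see the Task Masks section below) are what protect task-specific knowledge while leaving shared knowledge open for transfer. Below is a minimal sketch of the general hard-attention-style masking idea, assuming a learnable per-task embedding turned into near-binary unit gates by a scaled sigmoid; the exact CLA formulation may differ, and this is not the authors' implementation.

```python
# Minimal sketch of hard-attention-style task masks. Each task learns an
# embedding that, via a scaled sigmoid, gates hidden units; units used by
# old tasks can then be protected. Names and sizes are illustrative.
import torch
import torch.nn as nn

class MaskedLayer(nn.Module):
    def __init__(self, num_tasks: int, dim: int = 64):
        super().__init__()
        self.fc = nn.Linear(dim, dim)
        # One learnable mask embedding per task.
        self.task_embed = nn.Embedding(num_tasks, dim)

    def forward(self, x, task_id, scale: float = 100.0):
        # A large scale pushes the sigmoid toward near-binary 0/1 gates.
        mask = torch.sigmoid(scale * self.task_embed(torch.tensor(task_id)))
        return self.fc(x) * mask  # only "open" units serve this task

layer = MaskedLayer(num_tasks=3)
out = layer(torch.randn(4, 64), task_id=0)
print(out.shape)  # torch.Size([4, 64])
```

When a new task is trained, gradients on units that previous tasks' masks mark as used would additionally be blocked to prevent forgetting; that step is omitted from the sketch.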

Related Work
Preliminary
Task Masks
Training
Hyperparameters
Conclusion
