Mask-guided BERT for few-shot text classification

Wenxiong Liao, Zihao Wu, Yiyang Zhang, Yuzhong Chen, David Liu, Xiaoke Huang, Xi Jiang, Sheng Li, Haixing Dai, Hongmin Cai, Xiang Li, Wei Liu, Quanzheng Li, Dajiang Zhu, Zhengliang Liu, Tianming Liu

doi:10.1016/j.neucom.2024.128576

Abstract

Transformer-based language models have achieved significant success in various domains. However, the transformer architecture is data-intensive and requires large amounts of labeled data, which are hard to obtain in low-resource scenarios (i.e., few-shot learning (FSL)). The main challenge of FSL is training robust models on a small number of samples, which frequently leads to overfitting. Here we present Mask-BERT, a simple and modular framework that helps BERT-based architectures tackle FSL. The proposed approach differs fundamentally from existing FSL strategies such as prompt tuning and meta-learning. The core idea is to selectively apply masks to text inputs and filter out irrelevant information, which guides the model to focus on the discriminative tokens that influence prediction results. In addition, we introduce a contrastive learning loss function that makes text representations from different categories more separable and text representations from the same category more compact. Experimental results on open-domain and medical-domain datasets demonstrate the effectiveness of Mask-BERT. Code and data are available at: github.com/WenxiongLiao/mask-bert
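
The two ideas in the abstract, selective masking and a contrastive objective, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes that per-token relevance scores are already available, since the abstract does not say how they are computed, and it uses a generic supervised contrastive loss with cosine similarity as a stand-in for the paper's loss. The function names `build_keep_mask` and `supervised_contrastive_loss`, and the parameters `keep_ratio` and `temperature`, are hypothetical.

```python
import math


def build_keep_mask(scores, keep_ratio=0.5):
    """Return a 0/1 mask that keeps the highest-scoring tokens.

    `scores` are per-token relevance scores. How they are obtained is an
    assumption here; the abstract does not specify it.
    """
    n_keep = max(1, round(len(scores) * keep_ratio))
    # Indices of the n_keep largest scores (ties broken by position).
    top = sorted(range(len(scores)), key=lambda i: (-scores[i], i))[:n_keep]
    keep = set(top)
    return [1 if i in keep else 0 for i in range(len(scores))]


def _cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss over a batch of text embeddings.

    Pulls same-label embeddings together and pushes different-label
    embeddings apart. Assumed form, not necessarily the paper's exact loss.
    """
    n = len(embeddings)
    total, n_anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no same-label partner contributes nothing
        logits = {j: _cosine(embeddings[i], embeddings[j]) / temperature
                  for j in range(n) if j != i}
        log_denom = math.log(sum(math.exp(v) for v in logits.values()))
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        n_anchors += 1
    return total / n_anchors if n_anchors else 0.0


# Toy usage: keep the two most relevant of four tokens.
mask = build_keep_mask([0.1, 0.9, 0.3, 0.8], keep_ratio=0.5)

# Toy usage: labels that match the embedding clusters give a lower loss.
emb = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_clustered = supervised_contrastive_loss(emb, [0, 0, 1, 1])
loss_mixed = supervised_contrastive_loss(emb, [0, 1, 0, 1])
```

One plausible way to use such a mask with a BERT-style encoder is to pass it as the attention mask so that low-relevance tokens are ignored, while the contrastive term is added to the usual classification loss. Both choices are assumptions made for illustration.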

Journal: Neurocomputing
Publication Date: Sep 13, 2024
Citations: 3

Similar Papers

Application of Transformer-Based Language Models to Detect Hate Speech in Social Media
Swapnanil Mukherjee ... Sujit Das
Journal of Computational and Cognitive Engineering | VOL. 2
17 Dec 2021

How Is a “Kitchen Chair” like a “Farm Horse”? Exploring the Representation of Noun-Noun Compound Semantics in Transformer-based Language Models
Mark Ormerod ... Barry Devereux
Computational Linguistics | VOL. 50
01 Mar 2024

Task-Specific Transformer-Based Language Models in Health Care: Scoping Review
Ha Na Cho ... Soyoung Ko
JMIR Medical Informatics | VOL. 12
18 Nov 2024

Revisiting Learnable Affines for Batch Norm in Few-Shot Transfer Learning
Moslem Yazdanpanah ... Eugene Belilovsky
01 Jun 2022
