Label-Specific Feature Augmentation for Long-Tailed Multi-Label Text Classification

Pengyu Xu, Bing Liu, Jian Yu, Sijin Lu, Liping Jing, Lin Xiao

doi:10.1609/aaai.v37i9.26259

Abstract

Multi-label text classification (MLTC) involves tagging a document with its most relevant subset of labels from a label set. In real applications, labels usually follow a long-tailed distribution, where most labels (called tail labels) are associated with only a small number of documents, which limits the performance of MLTC. To alleviate this low-resource problem, researchers have introduced a simple but effective strategy, data augmentation (DA). However, most existing DA approaches struggle in multi-label settings. The main reason is that documents augmented for one label inevitably influence the other co-occurring labels and further exacerbate the long-tailed problem. To mitigate this issue, we propose a new pair-level augmentation framework for MLTC, called Label-Specific Feature Augmentation (LSFA), which augments only the positive feature-label pairs for the tail labels. LSFA contains two main parts. The first learns label-specific document representations in a high-level latent space. The second augments tail-label features in that latent space by transferring the documents' second-order statistics (intra-class semantic variations) from head labels to tail labels. Finally, we design a new loss function for adjusting the classifiers based on the augmented datasets. The whole framework can be trained effectively. Comprehensive experiments on benchmark datasets show that the proposed LSFA outperforms state-of-the-art counterparts.
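The abstract does not spell out how second-order statistics are transferred, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that label-specific features are plain NumPy arrays, that the intra-class variation of a head label is summarized by its sample covariance, and that new tail-label features are drawn by adding zero-mean Gaussian noise with that covariance to existing tail features. The function name `augment_tail_features` and all parameters are hypothetical.

```python
import numpy as np


def augment_tail_features(head_feats, tail_feats, n_new, seed=0):
    """Sample synthetic tail-label features using a head label's covariance.

    Assumption (not from the paper): Gaussian perturbation around existing
    tail features, with the covariance estimated from head-label features.
    """
    rng = np.random.default_rng(seed)
    # Second-order statistics (intra-class variation) of the head label.
    cov = np.cov(head_feats, rowvar=False)
    # Choose existing tail-label features as centers to perturb.
    centers = tail_feats[rng.integers(0, len(tail_feats), size=n_new)]
    # Zero-mean noise that carries the head label's covariance structure.
    noise = rng.multivariate_normal(np.zeros(cov.shape[0]), cov, size=n_new)
    return centers + noise


# Toy usage: 200 head-label features and 5 tail-label features in 4 dimensions.
rng = np.random.default_rng(1)
head = rng.normal(size=(200, 4))
tail = rng.normal(loc=3.0, size=(5, 4))
aug = augment_tail_features(head, tail, n_new=20)
print(aug.shape)
```

Because only the features paired with the tail label are augmented, no new documents are created, which is how a pair-level scheme of this kind would avoid adding positives for co-occurring labels.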

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Jun 26, 2023
Citations: 6

Similar Papers

EEG data augmentation: towards class imbalance problem in sleep staging tasks
Jiahao Fan ... Xinyu Jiang
Journal of Neural Engineering | VOL. 17
01 Oct 2020

Multi-Label Text Classification with Transfer Learning
Likhitha Yelamanchili
15 Jun 2023

ADA: An Attention-Based Data Augmentation Approach to Handle Imbalanced Textual Datasets
Amit Kumar Sah ... Muhammad Abulaish
01 Jan 2023

Learning from Multiple Noisy Augmented Data Sets for Better Cross-Lingual Spoken Language Understanding
Yingmei Guo ... Jian Pei
01 Jan 2020
