Abstract

In multi-label text classification (MLTC), each given document is associated with a set of correlated labels. To capture label correlations, previous classifier-chain and sequence-to-sequence models transform MLTC into a sequence prediction task. However, they tend to suffer from label order dependency, label combination over-fitting, and error propagation problems. To address these problems, we introduce a novel approach with multi-task learning to enhance label correlation feedback. We first utilize a joint embedding (JE) mechanism to obtain the text and label representations simultaneously. In the MLTC task, a document-label cross attention (CA) mechanism is adopted to generate a more discriminative document representation. Furthermore, we propose two auxiliary label co-occurrence prediction tasks to enhance label correlation learning: 1) Pairwise Label Co-occurrence Prediction (PLCP), and 2) Conditional Label Co-occurrence Prediction (CLCP). Experimental results on the AAPD and RCV1-V2 datasets show that our method outperforms competitive baselines by a large margin. We analyze low-frequency label performance, label dependency, label combination diversity, and convergence speed to show the effectiveness of our proposed method on label correlation learning.

Highlights

  • Multi-label text classification (MLTC) is an important natural language processing task with applications in text categorization, information retrieval, web mining, and many other real-world scenarios (Zhang and Zhou, 2014; Liu et al., 2020).

  • In order to train the model to understand second-order label relationships, we propose a binarized label-pair prediction task named Pairwise Label Co-occurrence Prediction (PLCP), whose training examples can be trivially generated from the multi-label classification corpus.

  • Our basic model LACO, trained only on the MLTC task, significantly improves previous results on Hamming loss and Micro-F1.
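
The PLCP task described above needs binarized label-pair examples derived from the gold label sets. The following is a minimal sketch of one plausible way to generate them; the helper name `plcp_pairs`, the exhaustive enumeration of pairs, and the label names are illustrative assumptions, not the paper's exact sampling strategy:

```python
from itertools import combinations

def plcp_pairs(doc_labels, label_space):
    """Illustrative helper: enumerate unordered label pairs over the label
    space and mark each pair 1 if both labels occur in the document's gold
    label set, else 0 (a binarized co-occurrence target)."""
    gold = set(doc_labels)
    examples = []
    for a, b in combinations(sorted(label_space), 2):
        examples.append(((a, b), int(a in gold and b in gold)))
    return examples

# A document tagged {cs.LG, stat.ML} over a 3-label space yields 3 pairs,
# exactly one of which is a positive co-occurrence example.
pairs = plcp_pairs({"cs.LG", "stat.ML"}, ["cs.LG", "cs.CL", "stat.ML"])
```

In practice a real implementation would likely subsample negative pairs rather than enumerate all of them, since the pair count grows quadratically with the label space.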

Introduction

Multi-label text classification (MLTC) is an important natural language processing task with applications in text categorization, information retrieval, web mining, and many other real-world scenarios (Zhang and Zhou, 2014; Liu et al., 2020). Previous sequence-to-sequence (Seq2Seq) based methods (Nam et al., 2017; Yang et al., 2018) have shown a powerful ability to capture label correlations by conditioning each prediction on the model's current hidden state and the prefix of previously predicted labels. However, Seq2Seq-based methods rely heavily on a predefined ordering of labels and are sensitive to that order (Vinyals et al.; Yang et al., 2019; Qin et al., 2019). They also suffer from low generalization ability, since they tend to overfit the label combinations in the training set and have difficulty generating unseen label combinations. Finally, errors may propagate during the inference stage, where the true previous target labels are unavailable and are replaced by labels generated by the model itself.
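
The label-order dependency problem above can be made concrete with a small sketch: a Seq2Seq formulation must linearize each unordered gold label set into a target sequence under some predefined global order, so the supervision signal itself changes with that order. The helper name, label names, and ordering heuristics below are illustrative assumptions, not taken from any cited system:

```python
def labels_to_sequence(label_set, label_order):
    """Linearize an unordered gold label set into the target sequence a
    Seq2Seq decoder would be trained on, under a predefined global order
    (illustrative helper; systems often order labels by frequency)."""
    rank = {lab: i for i, lab in enumerate(label_order)}
    return sorted(label_set, key=rank.__getitem__)

# Two plausible global orders over the same 3-label space:
freq_order = ["stat.ML", "cs.CL", "cs.LG"]   # e.g. descending frequency
alpha_order = ["cs.CL", "cs.LG", "stat.ML"]  # alphabetical

# The same gold label set yields different supervision targets under the
# two orders, which is the order-dependency issue described above.
gold = {"stat.ML", "cs.CL"}
seq_freq = labels_to_sequence(gold, freq_order)    # ["stat.ML", "cs.CL"]
seq_alpha = labels_to_sequence(gold, alpha_order)  # ["cs.CL", "stat.ML"]
```

Because the decoder is trained to reproduce one specific linearization, swapping the global order changes every training target even though the underlying label sets are identical.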
