Abstract

Text classification is a pivotal task in NLP (Natural Language Processing) that has received widespread attention recently. Most existing methods leverage deep learning to improve model performance. However, these models ignore the interaction information between all the sentences in a text when generating the current text representation, which results in a partial loss of semantics. Labels play a central role in text classification, yet the attention learned from text-label pairs in the joint space of labels and words is not leveraged, leaving considerable room for improvement. In this paper, we propose a text classification method based on a Self-Interaction attention mechanism and label embedding. First, our method introduces BERT (Bidirectional Encoder Representations from Transformers) to extract text features. A Self-Interaction attention mechanism is then employed to obtain text representations containing more comprehensive semantics. Moreover, we embed labels and words in a joint space to achieve dual-label embedding, which further leverages the attention learned from text-label pairs. Finally, texts are classified by the classifier according to the weighted label representations. Experimental results show that our method outperforms other state-of-the-art methods in terms of classification accuracy.
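To make the self-interaction idea concrete, here is a minimal sketch, not the authors' implementation: every sentence in a text attends to all the other sentences, so each output representation mixes semantics from the whole document. The function name, the use of scaled dot-product attention, and the assumption that each sentence is summarized by one BERT vector are our own illustrative choices.

```python
import torch
import torch.nn.functional as F

def self_interaction_attention(sent_vecs: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch: each sentence attends to every other
    sentence in the same text, so the resulting representations
    capture interaction information across all sentences.

    sent_vecs: (num_sentences, hidden_dim), e.g. one BERT [CLS]
    vector per sentence (an assumption, not the paper's spec).
    """
    d = sent_vecs.size(-1)
    # pairwise interaction scores between all sentence pairs
    scores = sent_vecs @ sent_vecs.T / (d ** 0.5)
    weights = F.softmax(scores, dim=-1)
    # each output row mixes semantics from every sentence in the text
    return weights @ sent_vecs
```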

Highlights

  • Text classification is an essential subtask in the field of NLP (Natural Language Processing), and the main goal is to annotate a given text sequence with one label or multiple labels describing the textual semantics

  • Neural network models are widely used for text representation, mainly including Convolutional Neural Networks (CNN) [2], Long Short-Term Memory (LSTM) [3], and Recurrent Neural Networks (RNN) [4]; these neural models only view the preceding text as context when producing the current text representation

  • The attention learned from text-label pairs in the joint space of labels and words is used to weight the text representations obtained through the Self-Interaction attention mechanism


Summary

INTRODUCTION

Text classification is an essential subtask in the field of NLP (Natural Language Processing); its main goal is to annotate a given text sequence with one or more labels describing the textual semantics. Wang et al. [9] embedded labels and words in a joint space for learning and effectively improved the performance of text classification. We employ a Self-Interaction attention mechanism to obtain text representations containing comprehensive semantics for text classification. We embed the set of labels and words in the joint space for learning, and we further use the attention gained from text-label pairs to weight the final label and text representations. We name our model the Pre-trained Labels embedding and Self-Interaction Attention based text classification Model (P-LSIAM). It combines the Self-Interaction attention mechanism with joint embedding learning of labels and words for text classification. The attention learned from text-label pairs in the joint space of labels and words is used in two ways: first, to weight the text representations obtained through the Self-Interaction attention mechanism; second, in the classification stage, where texts are classified by the classifier according to the weighted label representations.
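The following is a minimal sketch, under our own assumptions, of the joint label-word embedding idea described above: label embeddings and contextual text vectors share one space, their compatibility scores give an attention distribution over labels, and the classifier scores the attention-weighted label representations against a pooled text vector. The class name, the max-then-softmax scoring, and the mean pooling are illustrative choices, not the exact P-LSIAM architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LabelAttentionClassifier(nn.Module):
    """Hypothetical sketch of joint label-word embedding: labels and
    words live in the same space; their similarity yields an attention
    over labels, and the classifier scores the weighted label
    representations against a pooled text summary."""

    def __init__(self, num_labels: int, hidden_dim: int):
        super().__init__()
        # label embeddings trained jointly with the text encoder
        self.label_emb = nn.Parameter(torch.randn(num_labels, hidden_dim))

    def forward(self, text_vecs: torch.Tensor) -> torch.Tensor:
        # text_vecs: (seq_len, hidden_dim) contextual vectors, e.g. BERT outputs
        # text-label compatibility scores in the joint space
        scores = text_vecs @ self.label_emb.T                 # (seq_len, num_labels)
        # attention over labels from each label's best-matching position
        attn = F.softmax(scores.max(dim=0).values, dim=-1)    # (num_labels,)
        # weight each label representation by its attention
        weighted_labels = attn.unsqueeze(-1) * self.label_emb # (num_labels, hidden_dim)
        pooled = text_vecs.mean(dim=0)                        # simple text summary
        return weighted_labels @ pooled                       # per-label logits
```

Consistent with the "Pre-trained Labels embedding" in the model name, one could plausibly initialize `label_emb` from pre-trained embeddings of the label text, though that detail is an assumption here.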

RELATED WORK
JOINT EMBEDDING LEARNING OF LABELS
THE EFFECT OF EACH PART ON THE MODEL
Our model mainly consists of three parts.
Findings
CONCLUSION AND FUTURE WORK