Abstract
Text classification is a pivotal task in NLP (Natural Language Processing) that has received widespread attention recently. Most existing methods leverage deep learning to improve model performance. However, these models ignore the interaction information among all the sentences in a text when generating the current text representation, which results in a partial loss of semantics. Labels play a central role in text classification, yet the attention learned from text-label interactions in the joint space of labels and words is not leveraged, leaving room for further improvement. In this paper, we propose a text classification method based on a Self-Interaction attention mechanism and label embedding. First, our method introduces BERT (Bidirectional Encoder Representations from Transformers) to extract text features. A Self-Interaction attention mechanism is then employed to obtain text representations containing more comprehensive semantics. Moreover, we embed labels and words in a joint space to achieve dual-label embedding, which further leverages the attention learned from text-label interactions. Finally, texts are classified according to the weighted label representations. Experimental results show that our method outperforms other state-of-the-art methods in classification accuracy.
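The self-interaction step described above can be sketched as follows. This is a minimal numpy illustration, not the paper's implementation: `self_interaction_attention`, the input `S` (sentence vectors, e.g. pooled BERT features), and all shapes are assumptions introduced here. Each sentence attends to every other sentence so the updated representation carries cross-sentence semantics.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_interaction_attention(S):
    """S: (n_sentences, d) sentence vectors (e.g. from BERT).

    Computes pairwise interaction scores between all sentences,
    normalizes them into attention weights, and returns sentence
    representations enriched with cross-sentence context.
    """
    scores = S @ S.T / np.sqrt(S.shape[1])  # (n, n) pairwise interactions
    A = softmax(scores, axis=-1)            # each row sums to 1
    return A @ S                            # context-enriched sentences
```

Each output row is a convex combination of all sentence vectors, which is one simple way to realize the "interaction between all the sentences" idea.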
Highlights
Text classification is an essential subtask in the field of NLP (Natural Language Processing), whose main goal is to annotate a given text sequence with one or more labels describing its semantics
Neural network models are widely used for text representation, mainly including Convolutional Neural Networks (CNN) [2], Long Short-Term Memory (LSTM) [3], and Recurrent Neural Networks (RNN) [4]. The above neural-based models only view the preceding text as the context when producing the current text representation
The attention learned from text-label interactions in the joint space of labels and words is used to weight the text representations obtained through the Self-Interaction Attention Mechanism
Summary
Text classification is an essential subtask in the field of NLP (Natural Language Processing), whose main goal is to annotate a given text sequence with one or more labels describing its semantics. Wang et al. [9] embedded labels and words in a joint space for learning and effectively improved text classification performance. We employ a Self-Interaction Attention mechanism to obtain text representations containing comprehensive semantics for text classification. We embed the set of labels and words in the joint space for learning, and further use the attention gained from text-label interactions to weight the final label and text representations. We name our model the Pre-trained Labels embedding and Self-Interaction Attention based text classification Model (P-LSIAM). We jointly combine the Self-Interaction Attention Mechanism with joint embedding learning of labels and words in text classification. The attention learned from text-label interactions in the joint space of labels and words is used to weight the text representations obtained through the Self-Interaction Attention Mechanism. It is used again in the classification stage, where texts are classified according to the weighted label representations
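The label-embedding idea in the summary can be sketched as follows. This is an illustrative numpy sketch, not the P-LSIAM implementation: `label_attention_scores`, the inputs `H` (word features) and `L` (label embeddings in the same joint space), and the max-pooling choice are all assumptions introduced here. Labels and words live in one space, so their compatibility yields text-label attention that weights each label's relevance to the text.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def label_attention_scores(H, L):
    """H: (n_words, d) word features; L: (n_labels, d) label embeddings
    in the same joint space.

    Computes a label-word compatibility matrix, pools each label's best
    word match, and normalizes into per-label attention scores that can
    weight the label representations for classification.
    """
    G = L @ H.T / np.sqrt(H.shape[1])  # (n_labels, n_words) compatibility
    pooled = G.max(axis=1)             # strongest word match per label
    return softmax(pooled)             # (n_labels,) attention, sums to 1
```

A classifier could then score each class from the attention-weighted label representations, e.g. `scores * (L @ text_vector)`; the pooling and scoring choices here are one of several reasonable designs.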