Abstract

Neural networks have been widely used in text classification and have achieved good results on various Chinese datasets. However, long texts contain a great deal of redundant information, and some of that redundancy may involve other topics, which makes long text classification challenging. To address these problems, this paper proposes a new text classification model, called attention-based BiLSTM fused CNN with gating mechanism (ABLG-CNN). In ABLG-CNN, word2vec is used to train word vector representations. An attention mechanism computes context vectors for words to derive keyword information. A bidirectional long short-term memory network (BiLSTM) captures context features, and on top of these, a convolutional neural network (CNN) captures topic-salient features. Because long texts may contain sentences involving other topics, a gating mechanism is introduced to assign weights to the BiLSTM and CNN output features, yielding fused text features that are favorable for classification. ABLG-CNN can thus capture both contextual semantics and local phrase features. It is evaluated on two long-text news datasets, and the experimental results show that ABLG-CNN's classification performance surpasses that of other recent text classification methods.
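The core fusion step described above can be illustrated with a minimal sketch. The exact parameterization is an assumption (the abstract does not give the formula): here the gate is a sigmoid over the concatenated BiLSTM and CNN feature vectors, and the fused output is a per-dimension convex combination of the two.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(h_bilstm, h_cnn, W_g, b_g):
    """Blend BiLSTM context features with CNN local features via a sigmoid gate.

    The gate g lies in (0, 1) per dimension, so the fused vector is a
    convex combination of the two inputs (an assumed formulation, not
    necessarily the paper's exact one).
    """
    z = np.concatenate([h_bilstm, h_cnn])   # joint view of both feature sets
    g = sigmoid(W_g @ z + b_g)              # gate, one weight per feature dim
    return g * h_bilstm + (1.0 - g) * h_cnn

# Illustrative stand-ins for the two branches' output features.
rng = np.random.default_rng(0)
d = 8                                        # hypothetical feature dimension
h_bilstm = rng.standard_normal(d)            # BiLSTM output features
h_cnn = rng.standard_normal(d)               # CNN output features
W_g = rng.standard_normal((d, 2 * d)) * 0.1  # gate parameters (learned in practice)
b_g = np.zeros(d)

fused = gated_fusion(h_bilstm, h_cnn, W_g, b_g)
print(fused.shape)  # (8,)
```

In a trained model, `W_g` and `b_g` would be learned jointly with the rest of the network, letting the gate down-weight whichever branch is less reliable for a given input.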
