Abstract

Text classification is an important task in natural language processing. Deep learning models such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have achieved good results on this task, but multi-class text classification and fine-grained sentiment analysis remain challenging. In this paper, we propose a hybrid bidirectional recurrent convolutional neural network attention-based model, named BRCAN, to address these issues. The model combines bidirectional long short-term memory (Bi-LSTM) and a convolutional neural network with an attention mechanism and word2vec to perform fine-grained text classification. In our model, word2vec generates word vectors automatically, and a bidirectional recurrent structure captures the contextual information and long-term dependencies of sentences. A max-pooling layer of the CNN judges which words play an essential role in text classification, and the attention mechanism assigns them higher weights to capture the key components of texts. We conduct experiments on four datasets: Yahoo! Answers and Sogou News for topic classification, and Yelp Reviews and Douban Movies Top250 short reviews for sentiment analysis. The experimental results show that BRCAN outperforms state-of-the-art models.

Highlights

  • Text classification is an essential component in many natural language processing applications, such as topic classification [1] and sentiment analysis [2], [3].

  • The contributions of this paper can be summarized as follows: 1) We propose a hybrid framework that utilizes bidirectional long short-term memory (Bi-LSTM) [7] to capture the contextual information and long-term dependencies of sentences, picks out useful local features from the sequences generated by the Bi-LSTM via convolution and max-pooling operations, and assigns different weights to features according to their importance through the attention mechanism, realizing effective text classification.

  • The intermediate sentence feature representations generated by the Bi-LSTM are fed into a convolutional neural network (CNN) [6] layer to capture the local features of sentences; a max-pooling layer of the CNN then judges which words play the key role in text classification based on contextual information.


Summary

INTRODUCTION

Text classification is an essential component in many natural language processing applications, such as topic classification [1] and sentiment analysis [2], [3]. CNNs have been shown to learn local features from words or phrases [6]: a CNN uses sliding windows to acquire the most prominent features in sentences and attempts to extract an effective text representation by identifying the most influential n-grams across different semantic aspects. It usually trains faster than an RNN, but its ability to capture features over long distances is weaker. The contributions of this paper can be summarized as follows: 1) We propose a hybrid framework that utilizes Bi-LSTM to capture the contextual information and long-term dependencies of sentences, picks out useful local features from the sequences generated by the Bi-LSTM via convolution and max-pooling operations, and assigns different weights to features according to their importance through the attention mechanism, realizing effective text classification. In addition, a large number of experiments have been conducted to verify the effect of BRCAN on text classification, to analyze its performance on different datasets and the influence of the attention mechanism on the model, and to conduct a sensitivity analysis of the convolutional layer, filter, and max-pooling sizes.
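The pipeline described above (word2vec embeddings → Bi-LSTM context encoding → convolution with max pooling → attention weighting → classification) can be sketched in NumPy as follows. This is a simplified, shape-level illustration of the data flow only, not the authors' implementation: the Bi-LSTM is replaced by a fixed projection stand-in, all weights are random, and every dimension name (`seq_len`, `hid`, `n_filters`, the 3-class output) is an assumption for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy dimensions (assumed for illustration only).
seq_len, emb_dim, hid = 6, 8, 5   # sentence length, embedding size, LSTM hidden size
k, n_filters = 3, 4               # convolution window size, number of filters

# 1) word2vec embeddings for one sentence (random stand-ins here).
x = rng.normal(size=(seq_len, emb_dim))

# 2) Bi-LSTM stand-in: each time step becomes a 2*hid contextual vector
#    (a real Bi-LSTM would compute this recurrently in both directions).
W_ctx = rng.normal(size=(emb_dim, 2 * hid))
h = np.tanh(x @ W_ctx)                                  # (seq_len, 2*hid)

# 3) Convolution over time with window k, then max pooling over time,
#    keeping the strongest n-gram response per filter.
W_conv = rng.normal(size=(k, 2 * hid, n_filters))
conv = np.stack([np.tensordot(h[t:t + k], W_conv, axes=([0, 1], [0, 1]))
                 for t in range(seq_len - k + 1)])      # (seq_len-k+1, n_filters)
pooled = conv.max(axis=0)                               # (n_filters,)

# 4) Attention over the contextual states: score each step against a
#    learned context vector, softmax, then take the weighted sum so
#    key components receive higher weights.
u = rng.normal(size=(2 * hid,))
alpha = softmax(h @ u)                                  # (seq_len,) attention weights
attended = alpha @ h                                    # (2*hid,)

# 5) Concatenate pooled CNN features and the attended context vector,
#    then classify with a linear layer (3 classes assumed).
feat = np.concatenate([pooled, attended])
logits = feat @ rng.normal(size=(feat.size, 3))
probs = softmax(logits)
```

In a trained model, `W_ctx`, `W_conv`, `u`, and the output layer would be learned end-to-end; the sketch only shows how the three components compose into a single feature vector per sentence.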

RELATED RESEARCH
BIDIRECTIONAL RECURRENT LAYER
CONVOLUTIONAL LAYER
RESULT
Findings
CONCLUSION