Abstract
Text classification is of importance in natural language processing, as the massive text information containing huge amounts of value needs to be classified into different categories for further use. In order to better classify text, our paper tries to build a deep learning model which achieves better classification results in Chinese text than those of other researchers’ models. After comparing different methods, long short-term memory (LSTM) and convolutional neural network (CNN) methods were selected as deep learning methods to classify Chinese text. LSTM is a special kind of recurrent neural network (RNN), which is capable of processing serialized information through its recurrent structure. By contrast, CNN has shown its ability to extract features from visual imagery. Therefore, two layers of LSTM and one layer of CNN were integrated to our new model: the BLSTM-C model (BLSTM stands for bi-directional long short-term memory while C stands for CNN.) LSTM was responsible for obtaining a sequence output based on past and future contexts, which was then input to the convolutional layer for extracting features. In our experiments, the proposed BLSTM-C model was evaluated in several ways. In the results, the model exhibited remarkable performance in text classification, especially in Chinese texts.
Highlights
Motivated by the development of Internet technology and the progress of mobile social networking platforms, the amount of textual information is growing rapidly on the Internet
To validate the ability of our model on different languages, our bidirectional long short-term memory (BLSTM)-C model is compared with a simple long short-term memory (LSTM) model on the English news dataset as well as the Chinese news dataset that has the same categories as the English one
This paper mainly introduces a combined model called BLSTM-C that is made up of a bi-directional LSTM layer and a convolutional layer
Summary
Motivated by the development of Internet technology and the progress of mobile social networking platforms, the amount of textual information is growing rapidly on the Internet. Machine-learning-based methods, including naive Bayes, support vector machine, and k-nearest neighbors, are generally adopted by traditional text classification. Their performance depends mainly on the quality of hand-crafted features. In terms of natural language processing, CNN is able to extract n-gram features from different positions of a sentence through convolutional filters and it learns both short- and long-range relations through the operations of pooling. The BLSTM is employed firstly to capture the long-term sentence dependencies and CNN is adopted to extract features for sequence modeling tasks. It turns out that our model is more suitable for the Chinese language It is shown through our evaluation that our BLSTM-C model achieves remarkable results and it outperforms a wide range of baseline models
Published Version (Free)
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have