Abstract
Most recurrent neural network models currently used for text classification are shallow and have limited capacity to represent text, especially large-scale text. This paper presents an empirical study of a character-level deep recurrent neural network (Char-RNN) for Chinese text classification. The model takes character-level features as input and uses a multilayer recurrent neural network to extract features. Evaluations on THUCNews, a large-scale Chinese news corpus, show that the proposed model reaches 94.4% accuracy, outperforming traditional models such as LibSVM (A Library for Support Vector Machines), CBOW (Continuous Bag-of-Words), and CWE (character-enhanced word embedding), as well as deep learning models such as recurrent neural networks, on the large-scale Chinese text classification task.
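The two steps named in the abstract, character-level input followed by a multilayer recurrent network, can be illustrated with a small pure-Python sketch. It is not the authors' implementation: the abstract does not specify the recurrent cell, depth, or dimensions, so the plain tanh cell, the two layers, the sizes, the helper names (`build_vocab`, `encode`, `rnn_layer`, `classify`), and the two toy strings are all assumptions, and the weights are random and untrained.

```python
import math
import random


def build_vocab(texts):
    """Map each distinct character to an integer ID; 0 is reserved for padding/unknown."""
    chars = sorted({ch for text in texts for ch in text})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode(text, vocab, max_len):
    """Turn a string into a fixed-length list of character IDs (truncate, then pad with 0)."""
    ids = [vocab.get(ch, 0) for ch in text[:max_len]]
    return ids + [0] * (max_len - len(ids))


def rand_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def rnn_layer(inputs, w_in, w_rec):
    """Plain tanh recurrent layer (assumed cell): h_t = tanh(W_in x_t + W_rec h_{t-1})."""
    hidden = [0.0] * len(w_rec)
    outputs = []
    for x in inputs:
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[j], x))
                + sum(w * hi for w, hi in zip(w_rec[j], hidden))
            )
            for j in range(len(w_rec))
        ]
        outputs.append(hidden)
    return outputs


def classify(ids, embedding, layers, w_out):
    """Embed character IDs, run the stacked recurrent layers, softmax over the last state."""
    seq = [embedding[i] for i in ids]
    for w_in, w_rec in layers:
        seq = rnn_layer(seq, w_in, w_rec)  # each layer consumes the previous layer's outputs
    last = seq[-1]
    scores = [sum(w * h for w, h in zip(row, last)) for row in w_out]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
texts = ["体育新闻", "财经新闻"]  # toy strings, not THUCNews data
vocab = build_vocab(texts)
EMBED, HIDDEN, CLASSES, MAX_LEN = 8, 16, 2, 6  # illustrative sizes only

embedding = rand_matrix(len(vocab) + 1, EMBED, rng)
layers = [
    (rand_matrix(HIDDEN, EMBED, rng), rand_matrix(HIDDEN, HIDDEN, rng)),
    (rand_matrix(HIDDEN, HIDDEN, rng), rand_matrix(HIDDEN, HIDDEN, rng)),
]
w_out = rand_matrix(CLASSES, HIDDEN, rng)

ids = encode(texts[0], vocab, MAX_LEN)
probs = classify(ids, embedding, layers, w_out)
print(ids, probs)
```

Working at the character level avoids a Chinese word segmentation step, since each character maps directly to an ID. Stacking layers means each recurrent layer reads the full output sequence of the one below it, which is the sense in which the network is "deep" here.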