Abstract

Text classification is a key technology in information retrieval and data mining that can organize unstructured information and locate the information needed. This paper compares the deep neural network models CNN, RNN, and LSTM on Tibetan text classification. First, we train a BiLSTM_CRF model to segment the categorized Tibetan text. We then remove stop words, calculate word frequencies, and extract feature words to construct a word vector space model and obtain word vectors. Second, the word vectors are passed to the classification model to train the Tibetan text classifier. Finally, we use the trained classifier to classify Tibetan texts. Experiments show that deep neural networks classify better than traditional text classification methods when the amount of data is large, with the CNN classifier performing best. When the amount of data is small, the SVM model is effective.
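
The preprocessing steps named above (stop-word removal, word-frequency counting, feature-word extraction, and vectorization) can be illustrated with a minimal sketch. It is not the paper's implementation: the token lists, the stop-word set, and the function names `build_vocabulary` and `vectorize` are hypothetical, the input is assumed to be already segmented (the paper uses a BiLSTM_CRF model for that step), and simple frequency-count vectors stand in for whatever word vectors the authors actually trained.

```python
from collections import Counter

# Hypothetical placeholder data: each document is a list of tokens that a
# segmenter has already produced. English tokens are used for readability.
STOP_WORDS = {"the", "of"}
DOCS = [
    ["the", "match", "score", "match"],
    ["price", "of", "market", "price"],
    ["the", "match", "market"],
]


def build_vocabulary(docs, stop_words, max_features):
    """Count non-stop-word tokens over all documents and keep the most frequent."""
    counts = Counter(
        token for doc in docs for token in doc if token not in stop_words
    )
    return [word for word, _ in counts.most_common(max_features)]


def vectorize(doc, vocabulary):
    """Map one document to a vector of feature-word counts."""
    counts = Counter(doc)  # a Counter returns 0 for words it has not seen
    return [counts[word] for word in vocabulary]


vocabulary = build_vocabulary(DOCS, STOP_WORDS, max_features=3)
vectors = [vectorize(doc, vocabulary) for doc in DOCS]
```

Each row of `vectors` could then be passed to any classifier; in the paper that role is played by the CNN, RNN, LSTM, or SVM model.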
