Abstract

To address the problem that existing Tibetan text representations ignore the importance of individual words, this paper proposes a Tibetan text representation method that combines Word2vec with an improved TF-IDF. First, the method uses the Word2vec model to train word vectors for all words in the text, which captures the semantic information of the text. Second, the improved TF-IDF algorithm is used to calculate the weight of each word, and thus of each word vector, in the text. Word2vec and the improved TF-IDF algorithm are then fused to construct a Tibetan text vector representation model based on word vectors and weights. Finally, a BiLSTM neural network classifier is used to classify the Tibetan text. The experimental results show that this method outperforms the traditional method in Tibetan text classification, which verifies its effectiveness.
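
The fusion step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the improved TF-IDF, so plain TF-IDF with a smoothed IDF term stands in for it, and a small hand-written dictionary stands in for trained Word2vec vectors. The function names `tfidf_weights` and `weighted_text_vector`, the tokens `w1` to `w3`, and the two-dimensional vectors are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {word: weight} dict per tokenized document.

    Assumption: plain TF-IDF with a smoothed IDF, used here as a stand-in
    for the paper's improved TF-IDF, which the abstract does not define.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each word occurs.
    df = Counter(word for doc in docs for word in set(doc))
    all_weights = []
    for doc in docs:
        tf = Counter(doc)
        all_weights.append({
            word: (count / len(doc))
            * (math.log((1 + n_docs) / (1 + df[word])) + 1.0)
            for word, count in tf.items()
        })
    return all_weights


def weighted_text_vector(weights, word_vectors, dim):
    """Combine word vectors into one text vector, normalized by total weight."""
    total = [0.0] * dim
    weight_sum = 0.0
    for word, weight in weights.items():
        vec = word_vectors.get(word)
        if vec is None:  # skip words that have no vector
            continue
        for i in range(dim):
            total[i] += weight * vec[i]
        weight_sum += weight
    if weight_sum == 0.0:
        return total
    return [x / weight_sum for x in total]


# Hypothetical toy data: placeholder tokens and hand-written 2-d vectors
# standing in for segmented Tibetan text and trained Word2vec embeddings.
docs = [["w1", "w2"], ["w1", "w3"]]
word_vectors = {"w1": [1.0, 0.0], "w2": [0.0, 1.0], "w3": [1.0, 1.0]}

weights = tfidf_weights(docs)
text_vectors = [weighted_text_vector(w, word_vectors, 2) for w in weights]
```

In this toy corpus, `w1` occurs in both documents and so receives a lower weight than `w2` or `w3`, which pulls each text vector toward its rarer word. In the pipeline the abstract describes, the word vectors would come from a trained Word2vec model and the resulting representation would be passed to the BiLSTM classifier; neither step is shown here.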
