Abstract
FastText is a text classification model developed by Facebook. Because the model is structurally simple, it is fast and efficient. However, its accuracy decreases when it is applied to Chinese text classification. To address this, the paper proposes a Chinese FastText text classification method that combines Term Frequency-Relevance Frequency (TF-RF) with an improved random walk model. At the input stage of the FastText model, the method applies TF-RF weighting to select entries from the N-gram-processed dictionary, performs semantic analysis using Probabilistic Latent Semantic Analysis (PLSA), and supplements the feature words. It then uses the improved random walk model to raise accuracy, making the resulting model better suited to Chinese text classification. Experimental results show that the improved model performs better on Chinese text classification.
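The TF-RF selection step named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact formulation, so this assumes the commonly used relevance frequency definition rf = log2(2 + a / max(1, c)), where a and c are the numbers of documents containing the term inside and outside the target category. The helper names (`char_ngrams`, `tf_rf`, `select_top`), the use of character bigrams, and the choice to aggregate term frequency over the whole category are all illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def char_ngrams(text, n=2):
    """Return the character n-grams of a string (hypothetical helper)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def tf_rf(docs, labels, target, n=2):
    """Score each n-gram of one target category with TF-RF.

    Assumed definition: rf = log2(2 + a / max(1, c)), where a and c count
    the documents containing the n-gram inside and outside the category.
    Term frequency is summed over the category's documents, which is a
    simplification chosen here for dictionary pruning.
    """
    tf, pos_df, neg_df = Counter(), Counter(), Counter()
    for text, label in zip(docs, labels):
        grams = char_ngrams(text, n)
        if label == target:
            tf.update(grams)           # occurrences inside the category
            pos_df.update(set(grams))  # documents inside the category
        else:
            neg_df.update(set(grams))  # documents outside the category
    return {
        g: tf[g] * math.log2(2 + pos_df[g] / max(1, neg_df[g]))
        for g in tf
    }


def select_top(weights, k):
    """Keep the k highest-weighted n-grams as dictionary entries."""
    return sorted(weights, key=weights.get, reverse=True)[:k]


docs = ["足球比赛", "足球联赛", "股票市场"]
labels = ["sports", "sports", "finance"]
weights = tf_rf(docs, labels, target="sports")
print(select_top(weights, 1))
```

In this toy corpus the bigram that appears in both sports documents and in no finance document receives the highest weight, which is the behavior the weighting is meant to reward: n-grams concentrated in one category are kept, while n-grams spread evenly across categories are pruned.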