Abstract

With the continuous development of the Internet, social media based on short text has become popular. However, the sparsity and brevity of short texts limit the accuracy of text classification. Therefore, based on the BERT model, we capture the mental features of reviewers and apply them to short text classification to improve its accuracy. Specifically, we construct a text model at the language level and fine-tune it to better embed mental features. To verify the accuracy of this method, we compare it with a variety of machine learning methods, such as support vector machines, convolutional neural networks, and recurrent neural networks. The results show the following: (1) Feature comparison shows that mental features can significantly improve the accuracy of short text classification. (2) Combining mental features and text into a single input vector yields higher classification accuracy than treating them as two independent vectors. (3) Model comparison shows that BERT can integrate mental features with short text and capture those features better than the other models, which improves the accuracy of the classification results. This will help to promote the development of short text classification.

Highlights

  • With the proliferation of online text information, text classification plays a vital role in obtaining information resources [1]

  • Bidirectional encoder representations from transformers (BERT) can better capture mental features to improve the accuracy of classification results. This will help to promote the development of short text classification

  • The news text content (Text) method (NTM) reaches an accuracy of 0.476 on English Fake News Detection (EFND) and 0.960 on Chinese Topic Detection (CTD). Both results are significantly better than those of the Pair Method (PM). The main reason is that PM inputs text and features as separate sequences
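
The difference between feeding text and features as one sequence and feeding them as separate sequences can be sketched in a few lines. The example below is only an illustration under stated assumptions: it follows the standard BERT convention of `[CLS]`/`[SEP]` markers and segment ids, the function names `build_joint_input` and `build_pair_input` are hypothetical, and the exact encoding used in the paper is not given in this summary.

```python
from typing import List, Tuple

CLS, SEP = "[CLS]", "[SEP]"  # standard BERT special tokens


def build_joint_input(text_tokens: List[str],
                      feature_tokens: List[str]) -> Tuple[List[str], List[int]]:
    """Hypothetical helper: text and feature tokens share a single segment."""
    tokens = [CLS] + text_tokens + feature_tokens + [SEP]
    segment_ids = [0] * len(tokens)  # every token belongs to segment 0
    return tokens, segment_ids


def build_pair_input(text_tokens: List[str],
                     feature_tokens: List[str]) -> Tuple[List[str], List[int]]:
    """Hypothetical helper: text and features form two separate sequences."""
    tokens = [CLS] + text_tokens + [SEP] + feature_tokens + [SEP]
    # Segment 0 covers [CLS], the text, and the first [SEP];
    # segment 1 covers the feature tokens and the final [SEP].
    segment_ids = [0] * (len(text_tokens) + 2) + [1] * (len(feature_tokens) + 1)
    return tokens, segment_ids


if __name__ == "__main__":
    text = ["stocks", "fell", "sharply"]   # made-up short text
    features = ["anxious"]                 # made-up mental-feature token
    print(build_joint_input(text, features))
    print(build_pair_input(text, features))
```

In the pair layout the model sees the features under a different segment id, separated from the text by `[SEP]`; in the joint layout they sit in the same segment as the words they describe.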


Summary

Introduction

With the proliferation of online text information, text classification plays a vital role in obtaining information resources [1]. As an efficient and well-known natural language processing technology, text classification can identify the content of a given document and find the relationship between document features and document categories. It is widely used in various fields, such as event detection [2, 3], media analysis [4, 5], viewpoint mining [6, 7], and predicting product revenue [8, 9]. Therefore, to promote content analysis of online text information, a reliable text classification tool is needed [10]. Traditional classification algorithm models include K-nearest neighbor (KNN) [11], naive Bayes (NB) [12], and support vector machine (SVM) [13]. These models have good classification results and have been widely used.

