Abstract
With the continuous development of the Internet, social media based on short text has become popular. However, the sparsity and brevity of short texts restrict the accuracy of text classification. Therefore, based on the BERT model, we capture the mental features of reviewers and apply them to short text classification to improve its accuracy. Specifically, we construct a text model at the language level and fine-tune it to better embed mental features. To verify the accuracy of this method, we compare it with a variety of machine learning methods, such as support vector machines, convolutional neural networks, and recurrent neural networks. The results show the following: (1) Feature comparison shows that mental features can significantly improve the accuracy of short text classification. (2) Combining mental features and text into a single input vector yields higher classification accuracy than treating them as two independent vectors. (3) Model comparison shows that BERT can integrate mental features with short text and captures mental features better than the other models, which improves the accuracy of the classification results. These findings will help to promote the development of short text classification.
Highlights
With the proliferation of online text information, text classification plays a vital role in obtaining information resources [1].
Bidirectional encoder representations from transformers (BERT) can better capture mental features to improve the accuracy of classification results. This will help to promote the development of short text classification.
The News text content (Text) Method (NTM) achieves an accuracy of 0.476 on English Fake News Detection (EFND) and 0.960 on Chinese Topic Detection (CTD). Both results are significantly better than those of the Pair Method (PM). The main reason is that PM inputs text and features as separate sequences.
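The difference between feeding text and features as one sequence and feeding them as separate sequences can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the whitespace tokenization, and the example feature tokens are assumptions made for illustration, and the source does not say how mental features are actually encoded. Only the `[CLS]`/`[SEP]` layout and the 0/1 segment ids follow the standard BERT input convention.

```python
from typing import List, Tuple

CLS, SEP = "[CLS]", "[SEP]"


def single_sequence_input(text: str, features: List[str]) -> Tuple[List[str], List[int]]:
    """Text and feature tokens share one segment, so every segment id is 0."""
    # Whitespace splitting stands in for a real WordPiece tokenizer (assumption).
    tokens = [CLS] + text.lower().split() + features + [SEP]
    return tokens, [0] * len(tokens)


def pair_input(text: str, features: List[str]) -> Tuple[List[str], List[int]]:
    """Text is segment 0 and features are segment 1, as in BERT sentence-pair input."""
    first = [CLS] + text.lower().split() + [SEP]
    second = features + [SEP]
    return first + second, [0] * len(first) + [1] * len(second)


if __name__ == "__main__":
    text = "Breaking news you will not believe"
    features = ["surprise", "doubt"]  # hypothetical mental-feature tokens
    for build in (single_sequence_input, pair_input):
        tokens, segment_ids = build(text, features)
        print(build.__name__, tokens, segment_ids)
```

In the pair layout the features sit behind a separator and carry a different segment id, so the model is told to treat them as a second sentence rather than as part of the text. This may be one way to read the reported gap between the two methods.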
Summary
With the proliferation of online text information, text classification plays a vital role in obtaining information resources [1]. As an efficient and well-known natural language processing technology, text classification can identify the content of a given document and find the relationship between document features and document categories. It is widely used in various fields, such as event detection [2, 3], media analysis [4, 5], viewpoint mining [6, 7], and predicting product revenue [8, 9]. Therefore, to promote content analysis of online text information, a reliable text classification tool is needed [10]. Traditional classification algorithm models include K-nearest neighbor (KNN) [11], naive Bayes (NB) [12], and support vector machine (SVM) [13]. These models have good classification results and have been widely used.