Abstract

Because online social media (OSM) is open and easily accessible, anyone can contribute a short paragraph of text to express an opinion on an article they have seen. In the absence of access control mechanisms, many suspicious messages and accounts have been reported spreading across multiple platforms. Identifying and labeling fake news is therefore a demanding problem, given the massive amount of heterogeneous content. Machine learning (ML) and natural language processing (NLP) can enhance, speed up, and automate the analytical process, transforming unstructured text into meaningful data and insights. In this paper, a combination of ML and NLP is implemented to classify fake news using an open, large, labeled corpus from Twitter. We compare several state-of-the-art ML and neural network models based on content-only features. To enhance classification performance, term frequency-inverse document frequency (TF-IDF) features were applied before training the ML models, while word embeddings were used for the neural network models. With these methods, all the traditional models achieve greater than 85% accuracy, and all the neural network models achieve greater than 90% accuracy. In our experiments, the neural network models outperform the traditional ML models by approximately 6% in precision on average.
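
As an illustration of the TF-IDF step mentioned above, the following is a minimal pure-Python sketch, not the authors' implementation. The whitespace tokenizer, the smoothed IDF variant log((1 + N) / (1 + df)) + 1, the function names, and the example tweets are all assumptions made for the example; the paper's actual preprocessing and weighting may differ.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real preprocessing)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return (idf, vectors) for a list of raw strings.

    idf maps each term to its inverse document frequency; vectors holds one
    {term: weight} dict per document.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))

    # Smoothed IDF (an assumed variant): terms found in every document get 1.0,
    # rarer terms get a larger weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Term frequency is the relative count of the term within the document.
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return idf, vectors


# Hypothetical tweets, invented for illustration only.
example_tweets = [
    "breaking news shocking cure",
    "official news report",
    "news update today",
]
example_idf, example_vectors = tfidf_vectors(example_tweets)
```

Terms that occur in every tweet, such as "news" here, receive the lowest weight, while terms confined to a single tweet are weighted more heavily. The resulting sparse vectors are the kind of content-only features a traditional classifier could be trained on.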

Highlights

  • Fake news is becoming a common term in the human vocabulary [1,2]

  • The neural network models outperform traditional machine learning (ML) models, with all neural network models reaching greater than 90% accuracy

  • Earlier empirical studies aimed to classify tagged false news, and natural language processing (NLP) is integrated with classical machine learning to create an artificial intelligence (AI) model

Summary

Introduction

Fake news is becoming a common term in the human vocabulary [1,2]. Misinformation remains rife, with clickbait stories and polarizing videos usually distributed over OSM and mainstream news [3]. Recent world events show how much the news affects our lives: from following the pandemic to tracking the stock market rally, everyone relies heavily on the news. Blogs and social media messages can be intentionally misleading for several different reasons. They may seek to manipulate elections or policies; they may be a form of cyber combat between states; they may attempt to increase someone's popularity and power or undermine their opponents. Or they may simply aim to make money from ad revenue [4].

