Abstract

Current events in diverse fields are now published in newspapers, shared on social media and broadcast on radio and television. The explosive growth of online news content has made it very difficult to distinguish real news from fake. As a result, fake news has become prevalent and immensely challenging to analyze and verify. It is a major challenge for governments and the public to assess each situation case by case. A mechanism is therefore needed for fact-checking rumors and statements, particularly those that attract thousands of views and likes before being debunked and refuted by expert sources. Various machine learning techniques have been used to detect and classify fake news. However, these approaches are limited in accuracy. This study applies a random forest (RF) classifier to predict whether news is fake or real. Twenty-three (23) textual features are extracted from the ISOT Fake News Dataset. Four feature selection techniques (chi2, univariate, information gain and feature importance) are used to select the fourteen best features out of the twenty-three. The proposed model and other benchmark techniques are evaluated on the benchmark dataset using the best features. Experimental findings show that the proposed model outperformed state-of-the-art machine learning techniques such as GBM, XGBoost and the AdaBoost regression model in terms of classification accuracy.
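The pipeline described above (select 14 of 23 features with each of four techniques, then train a random forest) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data is synthetic (`make_classification` stands in for the 23 textual features of the ISOT dataset), "univariate" is assumed to mean an ANOVA F-test, "information gain" is approximated by mutual information, "feature importance" is taken from a random forest, and all hyperparameters and helper names are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, chi2, f_classif, mutual_info_classif
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the 23 textual features (NOT the ISOT data).
X, y = make_classification(n_samples=600, n_features=23, n_informative=10,
                           n_redundant=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# chi2 requires non-negative inputs, so rescale using training statistics only.
scaler = MinMaxScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

K = 14  # number of features to keep, as in the abstract


def top_k_by_score(score_func):
    """Indices of the K features ranked highest by a univariate score."""
    selector = SelectKBest(score_func=score_func, k=K).fit(X_train, y_train)
    return selector.get_support(indices=True)


def mutual_info(X, y):
    """Mutual information with a fixed seed (assumed proxy for information gain)."""
    return mutual_info_classif(X, y, random_state=0)


def top_k_by_importance():
    """Indices of the K features with the highest random forest importances."""
    forest = RandomForestClassifier(n_estimators=100, random_state=0)
    forest.fit(X_train, y_train)
    return np.sort(np.argsort(forest.feature_importances_)[-K:])


selected = {
    "chi2": top_k_by_score(chi2),
    "univariate (ANOVA F, assumed)": top_k_by_score(f_classif),
    "information gain (mutual information)": top_k_by_score(mutual_info),
    "feature importance": top_k_by_importance(),
}

# Train one random forest per selected feature subset and record test accuracy.
accuracies = {}
for name, idx in selected.items():
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X_train[:, idx], y_train)
    accuracies[name] = accuracy_score(y_test, clf.predict(X_test[:, idx]))
    print(f"{name}: {accuracies[name]:.3f}")
```

The accuracies printed here come from synthetic data and say nothing about the results reported in the study; the sketch only shows how the four selection techniques plug into the same classifier.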

Highlights

  • The internet offers many possibilities along with many challenges when it comes to reporting the news

  • The results show that Random Forest attained the highest accuracy of 97.33%, followed by Gradient Boost at 96.27%, Extra Gradient Boost at 95.73%, K-Nearest Neighbors (KNN) at 95.25%, Decision Tree at 93.07% and MLP at 92.64%, while Logistic Regression had the lowest accuracy at 45.54%

  • The results demonstrate that the proposed model performed better than the other individual models, attaining an accuracy score of 97.27% on the best features selected using the univariate feature selection technique
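A comparison like the one summarized in these highlights can be sketched by fitting each classifier on the same split and ranking the test accuracies. The snippet below is only an illustration: the data is synthetic, the hyperparameters are hypothetical, and XGBoost is left out because it is not part of scikit-learn, so the printed numbers bear no relation to the reported ones.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the 14 selected features (NOT the ISOT data).
X, y = make_classification(n_samples=500, n_features=14, n_informative=8,
                           random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=1)

# Hypothetical settings; the study's actual configurations are not given here.
models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=1),
    "Gradient Boost": GradientBoostingClassifier(random_state=1),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Decision Tree": DecisionTreeClassifier(random_state=1),
    "MLP": MLPClassifier(max_iter=1000, random_state=1),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

# Fit every model on the same training split and score it on the same test split.
scores = {name: model.fit(X_train, y_train).score(X_test, y_test)
          for name, model in models.items()}

# Rank models from highest to lowest test accuracy.
ranking = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
for name, acc in ranking:
    print(f"{name}: {acc:.4f}")
```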


Summary

Introduction

The internet offers many possibilities along with many challenges when it comes to reporting the news. In addition to conventional channels such as newspapers and TV, new communication channels such as blogs and social networks have arisen since the internet became a means of spreading news. This change has a positive side, since social media sites are highly effective and valuable, and a negative side, namely fake and inaccurate news, because editorial boards do not necessarily assess the trustworthiness of the information posted. These platforms are useful for sharing ideas and discussing issues such as governance, education and health. Organizations widely use most of these sites for monetary benefit and to pursue their own objectives.

