Abstract

The advent of the World Wide Web and the rapid adoption of social media platforms (such as Facebook and Twitter) paved the way for information dissemination on a scale never before witnessed in human history. With the current usage of social media platforms, consumers are creating and sharing more information than ever before, some of which is misleading and bears no relation to reality. Automated classification of a text article as misinformation or disinformation is a challenging task. Even an expert in a particular domain has to explore multiple aspects before giving a verdict on the truthfulness of an article. In this work, we propose a machine learning ensemble approach for the automated classification of news articles. Our study explores different textual properties that can be used to distinguish fake content from real. Using those properties, we train a combination of different machine learning algorithms using various ensemble methods and evaluate their performance on 4 real-world datasets. Experimental evaluation confirms the superior performance of our proposed ensemble learner approach in comparison to individual learners.
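As an illustration of what combining learners can look like, the sketch below implements hard (majority) voting over the labels predicted by several individual learners. The abstract does not specify which ensemble methods are used, so treating voting as the example is an assumption; the function name `majority_vote` and the sample predictions are hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions_per_learner):
    """Combine per-learner label lists into one label list by hard voting.

    predictions_per_learner: list of lists, one inner list per learner,
    each holding that learner's predicted label for every article.
    """
    if not predictions_per_learner:
        raise ValueError("at least one learner is required")
    n_samples = len(predictions_per_learner[0])
    if any(len(p) != n_samples for p in predictions_per_learner):
        raise ValueError("every learner must predict the same number of samples")

    combined = []
    # zip(*...) groups the votes that all learners cast for the same article.
    for votes in zip(*predictions_per_learner):
        # Pick the label with the most votes. With an odd number of learners
        # and two labels, a tie cannot occur.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical predictions from three individual learners on five articles.
learner_a = ["fake", "real", "fake", "real", "fake"]
learner_b = ["fake", "fake", "fake", "real", "real"]
learner_c = ["real", "real", "fake", "real", "fake"]

ensemble = majority_vote([learner_a, learner_b, learner_c])
print(ensemble)
```

Each learner here errs on a different article, so the combined vote can be correct even where a single learner is not, which is the intuition behind comparing ensemble learners against individual ones.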

Highlights

  • The advent of the World Wide Web and the rapid adoption of social media platforms paved the way for information dissemination on a scale never before witnessed in human history

  • The maximum accuracy on DS1 (ISOT Fake News Dataset) is 99%, achieved by the random forest algorithm and the Perez-linear support vector machine (LSVM)

  • Linear support vector machine (SVM), multilayer perceptron, bagging classifiers, and boosting classifiers achieved an accuracy of 98%. The average accuracy attained by ensemble learners is 97.67% on DS1, whereas the corresponding average for individual learners is 95.25%. The absolute difference between individual learners and ensemble learners is 2.42 percentage points, which is not significant


Summary

Research Article: Fake News Detection Using Machine Learning Ensemble Methods

Received 4 September 2020; Revised 14 September 2020; Accepted September 2020; Published October 2020.

There has been a rapid increase in the spread of fake news in the last decade, most prominently observed in the 2016 US elections [5]. Such proliferation of online articles that do not conform to facts has led to many problems, not limited to politics but extending to other domains such as sports, health, and science [3]. Our study explores different textual properties that can be used to distinguish fake content from real. Using those properties, we train a combination of different machine learning algorithms using various ensemble methods that are not thoroughly explored in the current literature. We conducted extensive experiments on 4 publicly available real-world datasets. The results validate the improved performance of our proposed technique on 4 commonly used performance metrics (namely, accuracy, precision, recall, and F-1 score).
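The four metrics named above can all be derived from the counts of true and false positives and negatives. The following is a minimal sketch under stated assumptions: "fake" is treated as the positive class, and the function name `classification_metrics` and the sample labels are hypothetical illustrations rather than data from the paper.

```python
def classification_metrics(y_true, y_pred, positive="fake"):
    """Return accuracy, precision, recall, and F1 for a binary labeling."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # fake article correctly flagged
        elif pred == positive:
            fp += 1  # real article wrongly flagged as fake
        elif truth == positive:
            fn += 1  # fake article missed
        else:
            tn += 1  # real article correctly passed

    accuracy = (tp + tn) / len(y_true)
    # Fall back to 0.0 when a denominator is zero (no positive predictions
    # or no positive ground-truth labels).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical ground truth and predictions for eight articles.
y_true = ["fake", "fake", "real", "real", "fake", "real", "real", "fake"]
y_pred = ["fake", "real", "real", "real", "fake", "fake", "real", "fake"]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this toy example there are 3 true positives, 3 true negatives, 1 false positive, and 1 false negative, so all four metrics evaluate to 0.75. On imbalanced data the four values generally diverge, which is one reason to report them together.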

Materials and Methods
[Figure labels: user query, trained model]
Minkowski distance
[Figure labels: predicted false]
Ensemble learners
Results and Discussion
[Table columns: Logistic, Linear SVM, Multilayer, K nearest, Random, Voting]
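The outline above lists Minkowski distance, presumably in connection with the K nearest neighbors learner, although the surviving text does not say so. For two feature vectors x and y and an order p >= 1, the Minkowski distance is (sum over i of |x_i - y_i|^p)^(1/p); p = 1 gives the Manhattan distance and p = 2 the Euclidean distance. A minimal sketch, with the hypothetical function name `minkowski_distance`:

```python
def minkowski_distance(x, y, p=2):
    """Minkowski distance of order p between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if p < 1:
        raise ValueError("p must be at least 1 for a valid distance metric")
    # Sum the p-th powers of the coordinate-wise absolute differences,
    # then take the p-th root.
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


# Hypothetical two-dimensional feature vectors.
manhattan = minkowski_distance([0, 0], [3, 4], p=1)
euclidean = minkowski_distance([0, 0], [3, 4], p=2)
print(manhattan, euclidean)
```

For the points (0, 0) and (3, 4), the Manhattan distance is 7 and the Euclidean distance is 5.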