Automatic Fake News Detector in Social Media Using Machine Learning and Natural Language Processing Approaches

P. VaraPrasada Rao, J. Srinivas, K. Venkata Subba Reddy, G. J. Sunny Deol

doi:10.1007/978-981-16-1502-3_30

Abstract

Fake news is a fabricated story intended to deceive or mislead people. The current research aims to detect fake news in social media such as Twitter, WhatsApp and Facebook by studying the responses of the proposed model to posts acquired from Reddit. Automatic fake news detection is a complex task because the model must combine natural language processing concepts with machine learning approaches. Two feature extraction algorithms, namely CountVectorizer (CV) and term frequency-inverse document frequency (TFIDF), were employed separately to extract the most relevant features from the dataset. These features were fed to multinomial naive Bayes (MNB), random forest (RF), support vector classifier (SVC) and logistic regression (LR) classifiers, giving a total of eight classification models. A single CV-based model served as the baseline for predicting fake news in the r/theonion and r/nottheonion datasets. GridSearchCV was also used to obtain the training and testing scores for the selected parameters. Of these models, TFIDF with MNB achieved an accuracy of 79.05% and was considered the best.
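
The eight-model setup described above (two feature extractors crossed with four classifiers, plus a grid search) might be sketched as follows. This is a minimal illustration assuming scikit-learn, not the authors' implementation: the toy headlines, the labels, the `build_models` helper and the parameter grid are all hypothetical and chosen only so that the example runs on its own.

```python
from itertools import product

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Hypothetical toy headlines (NOT the paper's data): 1 = satire, 0 = real news.
titles = [
    "Local man wins argument with his own reflection",
    "Area cat appointed head of neighborhood council",
    "Scientists confirm Monday is officially too long",
    "Nation's dogs demand longer walks in open letter",
    "City council approves budget for new bridge repairs",
    "Man fined after parking car on airport runway",
    "School closes for a day after water pipe bursts",
    "Court rules on dispute over neighbor's tall hedge",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# The two feature extractors and four classifiers named in the abstract.
vectorizers = {"CV": CountVectorizer, "TFIDF": TfidfVectorizer}
classifiers = {
    "MNB": lambda: MultinomialNB(),
    "RF": lambda: RandomForestClassifier(n_estimators=50, random_state=0),
    "SVC": lambda: SVC(),
    "LR": lambda: LogisticRegression(max_iter=1000),
}


def build_models():
    """Return one pipeline per (vectorizer, classifier) pair: 2 x 4 = 8 models."""
    models = {}
    for (v_name, v_cls), (c_name, c_fn) in product(
        vectorizers.items(), classifiers.items()
    ):
        models[f"{v_name}+{c_name}"] = Pipeline(
            [("vec", v_cls()), ("clf", c_fn())]
        )
    return models


models = build_models()

# Grid search over an assumed parameter grid for the TFIDF + MNB pipeline,
# keeping both training and held-out (test fold) scores.
search = GridSearchCV(
    models["TFIDF+MNB"],
    param_grid={
        "vec__ngram_range": [(1, 1), (1, 2)],
        "clf__alpha": [0.1, 1.0],
    },
    cv=2,
    return_train_score=True,
)
search.fit(titles, labels)

print(sorted(models))
print(search.best_params_)
```

After fitting, `search.cv_results_["mean_train_score"]` and `search.cv_results_["mean_test_score"]` hold the per-setting training and testing scores. With so few toy samples the scores carry no meaning; the sketch only shows how the pieces could fit together.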

Full Text