Feature-Based Comparative Study of Machine Learning Algorithms for Credibility Analysis of Online Social Media Content

Utkarsh Sharma, Shishir Kumar

doi:10.1007/978-981-16-2641-8_2

Abstract

As social media is increasingly used for sharing information, checking the credibility of that information has become essential. The more users rely on news spread through these platforms, the greater the chance that rumours spread through such content. To prevent hoaxes from escalating, content must be checked and labelled as fact or fake, either by human judgement or by widely used classification algorithms. In this paper, we perform a feature-based survey of four popular machine learning algorithms for classifying social media content as credible or fake. Data are collected through the Twitter API, which Twitter makes freely available. We propose feature selection using the ant colony optimization (ACO) algorithm. The algorithms compared are decision tree, Naïve Bayes, random forest and SVM. All of them are tested on predefined tweet features provided by the Twitter API, together with additional features extracted through feature engineering. We report our findings in terms of accuracy, precision and recall for each algorithm. We also apply tenfold cross-validation to these algorithms to obtain an appropriate measure of accuracy.

Keywords

Machine learning, OSN, Decision tree, Random forest, Naïve Bayes, SVM, ACO
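
The evaluation measures named in the abstract (accuracy, precision, recall and tenfold cross-validation) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the label encoding (1 for credible, 0 for fake) and the unshuffled, contiguous fold split are assumptions made for illustration, and no classifier is included.

```python
from typing import List, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int = 1) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for a binary labelling.

    Assumption: `positive` marks the "credible" class.
    """
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy_precision_recall(y_true: Sequence[int], y_pred: Sequence[int],
                              positive: int = 1) -> Tuple[float, float, float]:
    """Compute accuracy, precision and recall; undefined ratios become 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous (train, test) index pairs.

    Fold sizes differ by at most one. Samples are not shuffled here;
    in practice the data would usually be shuffled first.
    """
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train_idx, test_idx))
        start += size
    return folds
```

In a tenfold run, a classifier would be trained on each `train` index set and scored on the matching `test` set with `accuracy_precision_recall`, and the ten scores would then be averaged.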

Full Text