Behind the cues: A benchmarking study for fake news detection

Georgios Gravanis, Athena Vakali, Konstantinos Diamantaras, Panagiotis Karadais

doi:10.1016/j.eswa.2019.03.036

Abstract

Fake news has become a high-impact problem in our information-driven society because fakesters distribute content continuously and intensely. The veracity of information in news feeds is questionable, which calls for automated tools to detect fake news articles. Because fakesters take many forms, building such a tool is challenging. In this work, we propose a model for fake news detection that uses content-based features and Machine Learning (ML) algorithms. To arrive at the most accurate model, we evaluate several feature sets proposed for deception detection, as well as word embeddings. We also test the most popular ML classifiers and investigate the improvement that ensemble ML methods such as AdaBoost and Bagging can provide. An extensive set of existing data sources is used to experiment with and evaluate both the feature sets and the ML classifiers. In addition, we introduce a new text corpus, the “UNBiased” (UNB) dataset, which integrates various news sources and follows several standards and rules to avoid biased results in the classification task. Our experimental results show that an enhanced linguistic feature set with word embeddings, combined with ensemble algorithms and Support Vector Machines (SVMs), can classify fake news with high accuracy.

Full Text