Abstract

The negative impact of false information on social networks is growing rapidly. Research on the topic has so far focused on detecting fake news in a particular context or event (such as elections), or using data from a short period of time; an evaluation of current proposals in a long-term scenario, where the topics under discussion may change, is therefore lacking. In this work, we deviate from current approaches to the problem and instead perform a longitudinal evaluation using social network publications spanning an 18-month period. We evaluate different combinations of features and supervised models in a long-term scenario where the training and testing data are ordered chronologically, so that the robustness and stability of the models can be assessed over time. We experimented with three scenarios in which the models are trained on 15-, 30-, and 60-day data periods. The results show that detection models trained with word-embedding features perform best and are the least affected by changes of topic (for example, the rise of COVID-19 conspiracy theories). Additional days of training data also improve the performance of the best feature/model combinations, although only modestly (around 2%). The results presented in this paper lay the foundations for a more pragmatic approach to evaluating fake news detection models in social networks.
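The chronological evaluation described above can be sketched as a sliding-window split in which training posts always precede test posts. The function name, the tuple layout, and the 15-day test window below are illustrative assumptions for this sketch, not the paper's actual implementation:

```python
from datetime import datetime, timedelta

def chronological_splits(posts, train_days, test_days=15):
    """Yield (train, test) splits from time-stamped posts.

    `posts` is an iterable of (timestamp, features, label) tuples.
    Every training post strictly precedes every test post, so a model
    is always evaluated on data published after its training period.
    The 15-, 30-, and 60-day training periods from the paper map to
    the `train_days` argument.
    """
    posts = sorted(posts, key=lambda p: p[0])
    start, end = posts[0][0], posts[-1][0]
    window = timedelta(days=train_days + test_days)
    while start + window <= end + timedelta(days=1):
        train_end = start + timedelta(days=train_days)
        test_end = train_end + timedelta(days=test_days)
        train = [p for p in posts if start <= p[0] < train_end]
        test = [p for p in posts if train_end <= p[0] < test_end]
        if train and test:
            yield train, test
        start = train_end  # slide the window forward in time
```

A random shuffle split would leak future topics (e.g. COVID-19 conspiracies) into training data; the chronological split above avoids that leakage by construction.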

Highlights

  • Online Social Networks (OSNs) redefined the way we communicate

  • We focus on developing fake news detection models for social networks and evaluate their performance over a long-term period

  • Our main goal in this paper is to evaluate the feature importance and performance of fake news detection models in a more pragmatic scenario, where features and models are evaluated on chronologically ordered tweets

Introduction

Online Social Networks (OSNs) redefined the way we communicate. Since their inception, they evolved from a way to share media and information among small networks of friends into an entire medium for consuming and sharing content. In the early days of OSNs, malicious actors were responsible for the propagation of spam. These actors, or, as we may more properly name them, malicious accounts, now focus on the propagation of false or extremely biased information with the main objective of influencing users’ perception of topics such as politics and health. This content is often known as “fake news”, and its effects have already shaped real-world events, such as elections and health-related topics, namely conspiracies regarding the new coronavirus (COVID-19) pandemic. One such rumour circulated worldwide and was reported by a television newscast (https://www.nytimes.com/2018).
