The popularity of contemporary social media (SM) has affected democratic practices, and the success of presidential campaigns is frequently attributed to SM performance. In this new scenario, many methodological proposals that use SM data have been put forward for predicting election results. However, the most common approach, based on the volume and sentiment analysis of mentions on Twitter, has been frequently criticized and challenged. Recent surveys have therefore indicated new directions, such as the use of data from more than one SM platform, the adoption of nonlinear machine learning (ML) models, and the validation of methodologies and experiments across different elections. In this context, the present paper proposes SoMEN, the Social Media framework for Election Nowcasting, which comprises a process and an ML model for nowcasting election results using SM performance as features and offline polls as labeled data. It also defines SoMEN-DC, an execution strategy for SoMEN that enables continuous prediction during the campaign (DC). The proposed metrics and framework were applied to some of the main recent presidential elections in Latin America: Argentina (2019), Brazil (2018), Colombia (2018), and Mexico (2018). More than 65,000 posts were collected from the candidates' SM profiles on Facebook, Twitter, and Instagram, together with data from 195 presidential polls. The results demonstrate that it was possible to predict the candidates' final vote share with a high level of accuracy and to make daily predictions, with results competitive with or better than those of traditional polls. The strategies put forward in this study attempt to address several of the current challenges in this research area and indicate a new way of tackling these problems. They may also be used directly for predicting future elections in similar scenarios.