Abstract

The content posted by users on social networks is an important source of information for a myriad of applications in the broad field known as 'social sensing'. Twitter in particular hosts the thoughts, opinions and comments of its users, expressed as tweets; as a consequence, tweets are often analyzed with text mining and natural language processing techniques for tasks ranging from brand reputation and sentiment analysis to stance detection. In most cases, the intelligent systems designed for these tasks rely on a classification model that, once trained, is deployed into the data flow for online monitoring. In this work we show that this approach is inadequate for stance detection from tweets. The sequence of tweets collected every day is in fact a data stream and, as is well known in the data stream mining literature, classification models may suffer from concept drift, i.e. a change in the data distribution that can degrade performance. We present a broad experimental campaign on a case study: the online monitoring of the stance expressed on Twitter about vaccination in Italy. We compare different learning schemes and propose a novel one, aimed at addressing event-driven concept drift.
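To make the effect of concept drift concrete, the following is a minimal sketch, not any of the learning schemes compared or proposed in this work. It assumes a made-up toy stream in which the word "mandatory" signals a Favor stance before a hypothetical event and an Against stance after it, a deliberately naive word-count classifier, and labels that become available right after each prediction. All names (`train`, `predict`, `static_accuracy`, `sliding_accuracy`, `BEFORE`, `AFTER`) are illustrative. It contrasts a model trained once with one retrained on a sliding window of recent tweets, a generic adaptive strategy from the data stream literature.

```python
from collections import Counter, defaultdict, deque

def train(examples):
    """Count how often each word co-occurs with each stance label."""
    counts = defaultdict(Counter)
    for words, label in examples:
        for w in words:
            counts[w][label] += 1
    return counts

def predict(counts, words, default="favor"):
    """Sum the per-word label counts; fall back to a default for unseen words."""
    votes = Counter()
    for w in words:
        votes.update(counts.get(w, {}))
    return votes.most_common(1)[0][0] if votes else default

def static_accuracy(train_set, test_stream):
    """Train once, then classify the whole stream with the frozen model."""
    model = train(train_set)
    hits = sum(predict(model, words) == label for words, label in test_stream)
    return hits / len(test_stream)

def sliding_accuracy(train_set, test_stream, window_size=8):
    """Test-then-train: retrain on the most recent labeled tweets before each prediction."""
    window = deque(train_set, maxlen=window_size)  # keeps only the newest examples
    hits = 0
    for words, label in test_stream:
        model = train(window)
        hits += predict(model, words) == label
        window.append((words, label))  # assumes the true label arrives afterwards
    return hits / len(test_stream)

# Toy data (invented): the meaning of "mandatory" flips after a hypothetical event.
BEFORE = [(["mandatory"], "favor"), (["risk"], "against")] * 10
AFTER = [(["mandatory"], "against"), (["safe"], "favor")] * 10

if __name__ == "__main__":
    print("static :", static_accuracy(BEFORE, AFTER))
    print("sliding:", sliding_accuracy(BEFORE, AFTER))
```

On this toy stream the frozen model keeps reading "mandatory" as Favor and is right only half the time, while the windowed model recovers once enough post-event tweets have entered its window. The window size is an arbitrary choice here; a smaller window adapts faster but learns from less data.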

Highlights

  • Nowadays, millions of users mention and comment on real-world events by posting short messages, i.e. tweets, on the well-known Twitter platform

  • We formalize the notion of concept drift and report the most relevant adaptive solutions that have been proposed for text stream classification

  • Possible explanations may lie in the occurrence of concept drift and the intrinsic complexity associated with certain events or deriving from the annotation procedure


Summary

Introduction

Millions of users mention and comment on real-world events by posting short messages, i.e. tweets, on the well-known Twitter platform. The nature of these status update messages makes the platform suitable for stance detection studies. New words and hashtags continuously appear and become popular; irony, sarcasm and ambiguity are frequently present in messages; and Favor and Against stances can each be expressed with either sentiment polarity. All these aspects, combined with the limited length of tweets, make Twitter a harsh environment for the automatic analysis of text messages.

