Abstract

The new communication paradigm established by social media, along with its growing popularity in recent years, has attracted increasing interest from several research fields. One such field is event detection in social media. This article contributes a system that detects newsworthy events on Twitter. The proposed pipeline first splits tweets into segments and ranks them. The top k segments in this ranking are then grouped together into candidate events. Finally, the candidate events are filtered to retain only those related to real-world newsworthy events. The system was tested on three months of data, a total of 4,770,636 tweets written in Portuguese. The proposed approach achieved an overall precision of 88% and a recall of 38%.

Highlights

  • Social media services have become a very popular medium of communication, and people use them for many different reasons

  • The candidate events computed both before and after the filtering step were manually inspected and labeled as related or not related to real-world newsworthy events. This yielded Me, the number of candidate events found by the system before the filtering step that were considered related to real-world newsworthy events, as well as Te and Fe, the numbers of correct and incorrect classifications, respectively, among the events retained by the system after the filtering step

  • Candidate events considered to be related to the same real-world event were counted independently in order to simplify the process
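
The counts Me, Te and Fe described above are the usual ingredients of precision and recall. The sketch below assumes the conventional definitions, precision = Te / (Te + Fe) and recall = Te / Me; the excerpt does not state the formulas, so this is an assumption rather than the authors' confirmed computation. The function name and the example counts are hypothetical and are not taken from the paper.

```python
def precision_recall(te: int, fe: int, me: int) -> tuple[float, float]:
    """Compute precision and recall from manually labeled event counts.

    te: events kept by the filter and judged to be real newsworthy events
    fe: events kept by the filter but judged not to be real newsworthy events
    me: candidate events judged real *before* the filtering step
    Assumed definitions: precision = te / (te + fe), recall = te / me.
    """
    if min(te, fe, me) < 0:
        raise ValueError("counts must be non-negative")
    # Guard against empty denominators instead of raising ZeroDivisionError.
    precision = te / (te + fe) if (te + fe) > 0 else 0.0
    recall = te / me if me > 0 else 0.0
    return precision, recall


# Hypothetical counts, chosen only to illustrate the arithmetic.
p, r = precision_recall(te=8, fe=2, me=20)
print(f"precision={p:.2f} recall={r:.2f}")
```

Under these definitions, a strict filter raises precision by discarding doubtful candidates, at the cost of recall whenever real events are discarded with them.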


Summary

Introduction

Social media services have become a very popular medium of communication, and people use them for many different reasons. The popularity and real-time nature of these services, together with the fact that the data they generate are publicly available and reflect aspects of real-world societies, have attracted the attention of researchers in several fields (Madani, Boussaid, & Zegour, 2014; Nicolaos, Ioannis, & Dimitrios, 2016). One such field is event detection in social media. Event detection in social media has many potential applications, some with significant social impact, such as detecting natural disasters and identifying and tracking diseases and epidemics (Madani et al., 2014). Another relevant application is the detection of newsworthy topics and events, since real-world events are often discussed by users of these services before they are even reported in traditional media (Papadopoulos, Corney, & Aiello, 2014; Sakaki, Okazaki, & Matsuo, 2010; Van Canneyt et al., 2014). Tweets are grouped together according to a similarity measure computed using TF-IDF along with a boost factor obtained via the use of a Named Entity Recognizer (NER).
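
The TF-IDF similarity with a NER-based boost mentioned above could look roughly like the following sketch. It assumes that tweets are already tokenized, that the set of named-entity tokens comes from some external recognizer, and that the boost is a simple multiplier on the TF-IDF weight of those tokens. The excerpt does not specify how the boost is applied, so the multiplier, its value, the smoothed IDF variant and all function names are illustrative assumptions.

```python
import math
from collections import Counter


def tfidf_vectors(docs, entities, boost=2.0):
    """Build sparse TF-IDF vectors; tokens in `entities` get their weight multiplied by `boost`.

    docs: list of token lists (one per tweet)
    entities: set of tokens flagged by a named entity recognizer (assumed given)
    """
    n = len(docs)
    # Document frequency: number of tweets containing each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        vec = {}
        for tok, count in Counter(doc).items():
            idf = math.log((1 + n) / (1 + df[tok])) + 1.0  # smoothed IDF
            weight = (count / len(doc)) * idf
            if tok in entities:
                weight *= boost  # hypothetical NER boost factor
            vec[tok] = weight
        vectors.append(vec)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy tweets (invented for illustration), already tokenized.
tweets = [
    ["lisboa", "sismo", "forte"],
    ["sismo", "lisboa", "agora"],
    ["jogo", "benfica", "hoje"],
]
vecs = tfidf_vectors(tweets, entities={"lisboa", "benfica"})
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

With a multiplier above 1, tweets that share a named entity end up more similar than they would under plain TF-IDF, which is one plausible way for entity mentions to pull tweets about the same event together.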
