Abstract

A key challenge in mining social media data streams is to identify events that are actively discussed by a group of people in a specific local or global area. Detecting such events is useful for early warning of accidents, protests, elections or breaking news. However, neither the list of events nor the temporal and spatial resolution of each event is fixed or known beforehand. In this work, we propose an online spatio-temporal event detection system using social media that is able to detect events at different time and space resolutions. First, to address the unknown spatial resolution of events, a quad-tree method is used to split the geographical space into multiscale regions based on the density of social media data. Then, a statistical unsupervised approach, which involves a Poisson distribution and a smoothing method, highlights regions with an unexpected density of social posts. Further, event duration is estimated by merging events that happen in the same region at consecutive time intervals. A post-processing stage is introduced to filter out events that are spam, fake or wrong. Finally, we incorporate simple semantics by using social media entities to assess the integrity and accuracy of detected events. The proposed method is evaluated on two social media datasets, Twitter and Flickr, for four cities: Melbourne, London, Paris and New York. To verify the effectiveness of the proposed method, we compare our results with two baseline algorithms, one based on a fixed split of the geographical space and one based on a clustering method. For performance evaluation, we manually compute recall and precision. We also propose a new quality measure, named the strength index, which automatically measures how accurate a reported event is.
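The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `max_points` and `max_depth` limits, the significance level `alpha`, and the use of a plain Poisson upper-tail test are all assumptions made for illustration, and the smoothing, spam filtering and semantic steps are omitted. Points are assumed to lie inside the half-open bounding box `[x0, x1) x [y0, y1)`.

```python
import math

def quadtree_split(points, bounds, max_points=4, max_depth=8, depth=0):
    """Recursively split a region into four quadrants until each leaf holds
    at most max_points posts (hypothetical density criterion)."""
    x0, y0, x1, y1 = bounds
    if len(points) <= max_points or depth >= max_depth:
        return [(bounds, points)]
    mx, my = (x0 + x1) / 2, (y0 + y1) / 2
    quadrants = [(x0, y0, mx, my), (mx, y0, x1, my),
                 (x0, my, mx, y1), (mx, my, x1, y1)]
    leaves = []
    for qx0, qy0, qx1, qy1 in quadrants:
        # Half-open intervals so that each point falls in exactly one quadrant.
        inside = [(x, y) for x, y in points
                  if qx0 <= x < qx1 and qy0 <= y < qy1]
        leaves.extend(quadtree_split(inside, (qx0, qy0, qx1, qy1),
                                     max_points, max_depth, depth + 1))
    return leaves

def poisson_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    term = math.exp(-lam)  # P(X = 0)
    cdf = 0.0
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)  # P(X = i + 1) from P(X = i)
    return max(0.0, 1.0 - cdf)

def is_unusual(count, expected, alpha=0.001):
    """Flag a region whose post count is improbably high under a Poisson
    model with the given expected count (alpha is an assumed threshold)."""
    return poisson_tail(count, expected) < alpha

def merge_consecutive(flagged_intervals):
    """Merge flagged time-interval indices of one region into
    (start, end) runs, so each run stands for one event's duration."""
    runs = []
    for t in sorted(set(flagged_intervals)):
        if runs and t == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], t)
        else:
            runs.append((t, t))
    return runs
```

In this sketch, dense areas end up covered by many small leaves and sparse areas by a few large ones, which is how a density-driven split lets events surface at different spatial scales. Each leaf's count per time interval would then be passed to `is_unusual`, and the flagged interval indices for a leaf to `merge_consecutive`.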

Highlights

  • With the ubiquity of smartphones, Twitter and other social media services are frequently used as a source of news and other information [1,2,3]

  • We propose a novel approach to online spatio-temporal event detection that utilizes: (i) a quad-tree and Poisson model variant to dynamically identify events across different spatial scales; and (ii) a smoothing and filtering approach to effectively detect events with different temporal resolutions

  • Using tweets collected over a one-year period, we evaluate our algorithm using precision, recall and the strength index as metrics (Section "Case study: Twitter dataset")

Summary

Introduction

With the ubiquity of smartphones, Twitter and other social media services are frequently used as a source of news and other information [1,2,3]. People use social media to share news and photos about events they encounter in their daily lives, often in real time as the events unfold. Because of this real-time sharing, social media is a more efficient source of breaking news than traditional media, which are either slow to pick up such information or do not give a complete and accurate picture of the news and events. One emerging use case of significant importance is real-time event detection from social media information.
