Abstract

In massive Twitter datasets, tweets from a given domain, e.g., civil unrest, can be extracted to constitute spatio-temporal Twitter events for the detection of spatio-temporal distribution patterns. Existing algorithms generally employ scan statistics to detect spatio-temporal hotspots from Twitter events and do not consider how those events evolve in space and time. In this paper, a framework is proposed to discover evolving domain-related spatio-temporal patterns from Twitter data. Given a target domain, a dynamic query expansion is employed to extract related tweets, which form spatio-temporal Twitter events. The new spatial clustering approach proposed here uses multi-level constrained Delaunay triangulation to capture the spatial distribution patterns of Twitter events. An additional spatio-temporal clustering process is then performed to reveal the spatio-temporal clusters and outliers that are evolving into spatial distribution patterns. Extensive experiments on Twitter datasets related to an outbreak of civil unrest in Mexico demonstrate the effectiveness and practicability of the new method. The proposed method should help to predict the spatio-temporal evolution of Twitter events accurately, a step toward deeper geographical analysis of spatio-temporal Big Data.

Highlights

  • Spatio-temporal Big Data has the characteristics of volume, variety, velocity, veracity and value

  • This section describes two steps that are performed on the STTE: (1) Spatial distribution pattern detection; and (2) the discovery of evolving spatio-temporal patterns

  • Step II Discovery of evolving spatio-temporal patterns from STTE: (i) Determine the spatial neighborhoods of each spatial Twitter event and the spatial neighborhoods of each spatio-temporal Twitter event based on δ; (ii) Construct time windows based on ε and determine the temporal neighborhoods of each spatio-temporal Twitter event; (iii) Determine the spatio-temporal neighborhoods of each spatio-temporal Twitter event; and (iv) Extract spatio-temporal connected graphs based on the spatio-temporal proximity relationships and identify spatio-temporal clusters and outliers based on the volume of each spatio-temporal connected graph
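The four sub-steps of Step II can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several simplifying assumptions: a plain Euclidean distance threshold `delta` stands in for the Delaunay-based spatial neighborhoods, the time window is read as |t_i - t_j| <= `eps`, the "volume" of a connected graph is taken to be its number of events, and the names `st_clusters` and `min_volume` are hypothetical.

```python
from collections import deque
from math import hypot


def st_clusters(events, delta, eps, min_volume=3):
    """Group (x, y, t) events into spatio-temporal clusters and outliers.

    Assumptions (not from the paper): Euclidean distance <= delta defines
    spatial neighbours, |t_i - t_j| <= eps defines temporal neighbours, and
    a connected graph is a cluster when it holds at least min_volume events.
    """
    n = len(events)

    # Steps (i)-(iii): two events are spatio-temporal neighbours when they
    # are both spatial neighbours and temporal neighbours.
    neighbours = [[] for _ in range(n)]
    for i in range(n):
        xi, yi, ti = events[i]
        for j in range(i + 1, n):
            xj, yj, tj = events[j]
            if hypot(xi - xj, yi - yj) <= delta and abs(ti - tj) <= eps:
                neighbours[i].append(j)
                neighbours[j].append(i)

    # Step (iv): extract connected graphs with a breadth-first search and
    # label each one as a cluster or as outliers according to its size.
    seen = [False] * n
    clusters, outliers = [], []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        component, queue = [], deque([start])
        while queue:
            k = queue.popleft()
            component.append(k)
            for m in neighbours[k]:
                if not seen[m]:
                    seen[m] = True
                    queue.append(m)
        if len(component) >= min_volume:
            clusters.append(sorted(component))
        else:
            outliers.extend(component)
    return clusters, sorted(outliers)
```

With `events = [(0, 0, 0), (1, 0, 1), (1, 1, 2), (10, 10, 0), (0, 0, 50)]`, `delta=1.5` and `eps=2`, the first three events form one cluster. The fourth is too far away in space and the fifth too far away in time, so both are reported as outliers.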

Summary

Introduction

Spatio-temporal Big Data has the characteristics of volume, variety, velocity, veracity and value, and Twitter data has become one kind of spatio-temporal Big Data. Unknown and significant events from the huge mass of Twitter data has [...]. Development of a mining framework: a unified framework is proposed to discover evolving domain-related spatio-temporal patterns in Twitter. Prior knowledge is not required in the new framework. Extraction of domain-related Twitter events by dynamic query expansion: for the target domain, related tweets can be obtained using a dynamic query expansion strategy. These tweets, tagged with geo-location and time information, constitute spatio-temporal Twitter events.
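The summary does not spell out how the dynamic query expansion works, so the Python sketch below shows only one generic reading of the idea: start from seed keywords, add the terms that co-occur most often in the matched tweets, and repeat until the keyword set stops growing. The function name `expand_query`, the parameters `top_k`, `min_count` and `max_iter`, and the whitespace tokenization are all assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter


def expand_query(tweets, seeds, top_k=1, min_count=2, max_iter=5):
    """Iteratively grow a keyword set from co-occurring terms.

    Hypothetical, simplified illustration: real systems would also filter
    stop words and weight terms rather than count raw co-occurrences.
    """
    query = {w.lower() for w in seeds}
    tokenised = [set(t.lower().split()) for t in tweets]

    def match(keywords):
        # Indices of tweets sharing at least one term with the keyword set.
        return [i for i, words in enumerate(tokenised) if words & keywords]

    for _ in range(max_iter):
        matched = match(query)
        # Count, per term, how many matched tweets contain it.
        counts = Counter(
            w for i in matched for w in tokenised[i] if w not in query
        )
        # Rank by count, then alphabetically, so ties resolve the same way
        # on every run.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        new_terms = {w for w, c in ranked[:top_k] if c >= min_count}
        if not new_terms:
            break  # the keyword set has converged
        query |= new_terms
    return query, match(query)
```

For the tweets `["protest in mexico", "protest march mexico", "mexico march downtown", "sunny weather today"]` and the seed `["protest"]`, the loop first adds "mexico", then "march", and ends up matching the first three tweets while leaving the unrelated fourth one out.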

Twitter Event Extraction
Motivation
Domain Related Twitter Event Detection
Basic Definitions
Spatio-Temporal Twitter Events
Evolving Spatio-Temporal Patterns Discovery
Spatial Distribution Patterns Detection
Identification and Removal of II-Long Edges
Identification and Removal of III-Long Edges
Determination of Spatial Patterns
Experimental Evaluation and Analysis by Visualization
Dataset and Labels
The Results Obtained by ST-DBSCAN
Comparison with Labels
Conclusions
