Abstract

As a new form of social media, microblogs (e.g., Twitter and Weibo) play an important role in people's daily lives. As microblogs grow in popularity and size, there is a need for distributed approaches that can detect bursty events with low latency from the short-text data stream. In this paper, we propose a distributed and incremental temporal topic model for microblogs called Bursty Event dEtection (BEE+). BEE+ detects bursty events from short-text datasets and models their temporal information. It processes the post stream incrementally to track the topic drift of events over time. As a result, the latent semantic indices are preserved from one time period to the next. To achieve real-time processing, we design a distributed execution framework based on the Spark engine. To verify its ability to detect bursty events, we conduct experiments on a Weibo dataset of 6,360,125 posts. The results show that BEE+ can outperform the baselines in detecting meaningful bursty events and tracking topic drift. Copyright © 2015 John Wiley & Sons, Ltd.

