Abstract

In an age when people are predisposed to report real-world events through their social media accounts, many researchers value the benefits of mining user-generated content from social media. Compared with traditional news media, social media services such as Twitter can provide more complete and timely information about real-world events. However, events are often like a puzzle: to solve the puzzle and understand the event, we must identify all of its sub-events, or pieces. Existing Twitter event monitoring systems for sub-event detection and summarization typically analyse events based on partial data, because conventional data collection methodologies are unable to collect comprehensive event data. As a result, existing systems are often unable to report sub-events in real time and often miss sub-events, or pieces of the broader event puzzle, completely. This paper proposes a Sub-event detection by real-TIme Microblog monitoring (STRIM) framework that leverages the temporal feature of an expanded set of newsworthy event content. To identify sub-events more comprehensively and accurately, the framework first uses adaptive microblog crawling. Our adaptive microblog crawler is capable of increasing the coverage of events while minimizing the amount of non-relevant content. We then propose a stream division methodology that can be accomplished in real time, so that the temporal features of the expanded event streams can be analysed by a burst detection algorithm. In the final steps of the framework, the content features are extracted from each divided stream and recombined to provide a final summarization of the sub-events. The proposed framework is evaluated against traditional event detection using event recall and event precision metrics. Results show that improving the quality and coverage of event content contributes to better event detection by identifying additional valid sub-events.
The novel combination of our proposed adaptive crawler and our stream division/recombination technique provides significant gains in event recall (44.44%) and event precision (9.57%). The addition of these sub-events, or pieces, allows us to get closer to solving the event puzzle.
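
The stream division and recombination idea can be illustrated with a minimal sketch. It is not the STRIM implementation: the abstract does not specify the burst detection algorithm, so a simple trailing mean-plus-k-standard-deviations threshold is swapped in here, and every name (`window_counts`, `detect_bursts`, `combined_bursts`, the `"baseline"`/`"extra"` source labels, and all parameter values) is a hypothetical choice made for illustration. The sketch assumes each tweet is tagged as coming either from a conventional crawl or as extra content found by an adaptive crawler, runs burst detection separately on the full (adaptive) stream and on the extra-only stream, and recombines the detected windows.

```python
from statistics import mean, pstdev

def window_counts(timestamps, window, n_windows):
    """Count tweets per fixed-size time window (timestamps in seconds)."""
    counts = [0] * n_windows
    for t in timestamps:
        idx = int(t // window)
        if 0 <= idx < n_windows:
            counts[idx] += 1
    return counts

def detect_bursts(counts, history=3, k=2.0):
    """Flag window i as bursty when its count exceeds mean + k * std of the
    preceding `history` windows. This threshold rule is an assumption,
    not the detector used in the paper."""
    bursts = []
    for i in range(history, len(counts)):
        past = counts[i - history:i]
        threshold = mean(past) + k * pstdev(past)
        if counts[i] > threshold:
            bursts.append(i)
    return bursts

def combined_bursts(tweets, window=60, n_windows=10, history=3, k=2.0):
    """tweets: iterable of (timestamp, source), source in {"baseline", "extra"}.
    Divide into the adaptive stream (all tweets) and the extra-only stream,
    detect bursts in each, then recombine the bursty window indices."""
    tweets = list(tweets)
    adaptive = [t for t, _src in tweets]
    extra_only = [t for t, src in tweets if src == "extra"]
    a = detect_bursts(window_counts(adaptive, window, n_windows), history, k)
    e = detect_bursts(window_counts(extra_only, window, n_windows), history, k)
    return sorted(set(a) | set(e))
```

In this toy setting, a moderate spike that appears only in the extra content can be masked in the full stream by an earlier, larger spike that inflates the trailing threshold, while remaining visible in the extra-only stream. That is one plausible reading of why detection would be run on both streams before recombining.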

Highlights

  • Since the emergence of Web 2.0, the way people engage with news events has been fundamentally redefined

  • To maximise the utilisation of the extra event content identified by the adaptive crawler, our experiments show that the burst detection algorithm should be applied to both the adaptive stream and the extra-only stream

  • A peak window that is summarised by a tweet “I think I'd be quite into Glastonbury if I was some kind of predator or serial killer” in the Glastonbury Festival datasets is not considered a subevent whereas “Moves like Jagger is something a careworker writes on a physiotherapy report #bbcglasto.”

Introduction

Since the emergence of Web 2.0, the way people engage with news events has been fundamentally redefined. Instead of passively consuming online news, the general public is actively involved in reporting and commenting on different kinds of news events. They post observations and express their opinions through social media services such as Twitter. Recent research has shown that Twitter leads the traditional online newswires by reporting sports and disaster events more efficiently [19], and by providing broader coverage of event information [14] along with additional viewpoints [20, 21].
