Abstract

Online social networks are attracting attention worldwide, with a growing number of people relying on them to connect, communicate, and share information about the events of their daily lives. Event detection increasingly leverages online social networks to highlight events happening around the world via the Internet of People. In this paper, a novel Event Detection model based on Scoring and Word Embedding (ED-SWE) is proposed for discovering key events from high-volume streams of tweets and for generating an event summary from keywords and the top-k tweets. The proposed ED-SWE model can distill high-quality tweets, reduce the negative impact of spam, and automatically identify latent events in the data streams. Moreover, a word embedding algorithm is used to learn a real-valued vector representation for a predefined fixed-size vocabulary from a corpus of Twitter data. To further improve the performance of the Expectation-Maximization (EM) algorithm, a novel initialization method based on the authority values of the tweets is also proposed, so that live events can be detected efficiently and precisely. Finally, a novel identification method based on the cosine measure automatically evaluates whether a given topic can form a live event. Experiments conducted on a real-world dataset demonstrate that the ED-SWE model achieves better efficiency and accuracy than several state-of-the-art event detection models.
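The abstract does not say how the cosine measure is applied, so the following is only a minimal sketch of the general idea, not the ED-SWE procedure itself. It assumes that a topic is represented by the mean of its keyword embeddings and is compared against the mean embedding of its top-k tweets. The helper names (`mean_vector`, `looks_like_event`) and the threshold of 0.5 are hypothetical choices made for illustration.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen for this sketch: a zero vector matches nothing.
        return 0.0
    return dot / (norm_u * norm_v)


def mean_vector(vectors: Sequence[Sequence[float]]) -> list:
    """Component-wise average of a non-empty list of vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


def looks_like_event(keyword_vectors, tweet_vectors, threshold=0.5) -> bool:
    """Hypothetical test: do the topic keywords and its tweets point the same way?

    The threshold is an assumed value, not one taken from the paper.
    """
    topic = mean_vector(keyword_vectors)
    tweets = mean_vector(tweet_vectors)
    return cosine_similarity(topic, tweets) >= threshold
```

In practice the embedding vectors would come from a model trained on the tweet corpus, and the threshold would have to be tuned on labelled data.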
