Abstract

Popular real-world events often generate heavy traffic on Twitter, including real-time updates of important moments, personal comments, and so on, while the event is happening. Most users are interested in reading the important tweets, which are likely to cover the key moments of the event. However, extracting the tweets relevant to an event is challenging because of the endless stream of noisy tweets and the vocabulary variation of social media content. To handle these challenges, the authors introduce a new approach for computing relative tweet importance based on the PageRank algorithm, in which the adjacency matrix of the tweet graph is a semantic similarity matrix derived from the Word Mover's Distance measure over a Word2Vec word embedding model. The results show that the top-ranked tweets generated by the proposed approach are more concise and newsworthy than those of the baseline approaches.
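
As a rough illustration of the ranking idea described in the abstract, the sketch below runs a PageRank-style power iteration over a tweet similarity matrix. It is not the authors' implementation. The pairwise distances are hypothetical values standing in for Word Mover's Distance scores, the conversion `1 / (1 + d)` from distance to similarity is an assumption, and the function names and the damping factor of 0.85 are illustrative choices.

```python
def distances_to_similarity(dist):
    """Turn a symmetric distance matrix into similarities.

    Assumption: similarity = 1 / (1 + distance); self-similarity is set
    to 0 so that a tweet does not vote for itself.
    """
    n = len(dist)
    return [
        [0.0 if i == j else 1.0 / (1.0 + dist[i][j]) for j in range(n)]
        for i in range(n)
    ]


def pagerank(sim, damping=0.85, tol=1e-10, max_iter=200):
    """Power iteration over a weighted adjacency (similarity) matrix."""
    n = len(sim)
    # Row-normalise so each row is a probability distribution over neighbours.
    trans = []
    for row in sim:
        total = sum(row)
        trans.append([w / total for w in row] if total > 0 else [1.0 / n] * n)

    scores = [1.0 / n] * n
    for _ in range(max_iter):
        new = [
            (1.0 - damping) / n
            + damping * sum(scores[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
        converged = sum(abs(a - b) for a, b in zip(new, scores)) < tol
        scores = new
        if converged:
            break
    return scores


# Hypothetical pairwise distances between four tweets: the first three are
# semantically close to each other, the fourth is an off-topic outlier.
dist = [
    [0.0, 0.4, 0.5, 2.0],
    [0.4, 0.0, 0.3, 2.2],
    [0.5, 0.3, 0.0, 2.1],
    [2.0, 2.2, 2.1, 0.0],
]

scores = pagerank(distances_to_similarity(dist))
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In this toy input the off-topic tweet receives the lowest score, because it is only weakly connected to the rest of the graph. In practice the distances would come from an embedding-based measure rather than being typed in by hand.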

Highlights

  • Online social media networks have become a rich source of news distribution about real-world events of all kinds

  • While there exists extensive research on ranking for web search (Brin and Page (1998), Agichtein et al (2006), Xiang et al (2010), Aggarwal (2010)), little work has been done on ranking tweets, which creates the need for an efficient tweet ranking model with the following goals: Relevance: The top-ranked tweets constituting the summary of the event must be relevant to the specific event and contain important information that can be used in event analysis

  • The top-ranked tweets produced by the proposed approach for the first time-window of the second dataset are presented in Figure 2, and those of the compared approaches are presented in Figures 3, 4, 5, and 6, respectively

Summary

Introduction

Online social media networks have become a rich source of news distribution about real-world events of all kinds. The victory of Narendra Modi as Prime Minister of India in the 2019 general election induced millions of tweets on the night the results were declared. In such a sea of tweets on a topic or event, ranking has become an important issue on Twitter, not just in Web search. Most existing Twitter ranking algorithms are based on traditional text ranking approaches suited to traditional news text. These algorithms face various challenges posed by Twitter data, such as high volume and the use of informal language. Xu et al (2013) extend the TextRank algorithm to make it suitable for Twitter data. They extracted bi-grams instead of uni-grams from the tweets to form the nodes of a graph in which edges represent the co-occurrence of bi-grams within a fixed time-window. A lexical graph is built, and high-scoring lexical units, which are the main discussion points of the tweets, are included in the summaries.
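
The bi-gram graph attributed to Xu et al (2013) above can be pictured with a small sketch. This is only an illustration under stated assumptions, not their method: tokenisation is plain whitespace splitting, two bi-grams are taken to co-occur when they appear in the same tweet inside the time-window, and the function names, timestamps, and tweets are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def bigrams(text):
    """Return the word bi-grams of a tweet (lowercased, whitespace tokens)."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))


def cooccurrence_graph(tweets, window_start, window_end):
    """Build a weighted bi-gram co-occurrence graph for one time-window.

    tweets: iterable of (timestamp, text) pairs.
    Returns a dict mapping an ordered pair of bi-grams to the number of
    tweets in [window_start, window_end) that contain both.
    """
    graph = defaultdict(int)
    for timestamp, text in tweets:
        if not (window_start <= timestamp < window_end):
            continue
        # Sort so each undirected edge always gets the same key.
        nodes = sorted(set(bigrams(text)))
        for u, v in combinations(nodes, 2):
            graph[(u, v)] += 1
    return dict(graph)


# Hypothetical tweets with timestamps in seconds.
tweets = [
    (0, "modi wins election"),
    (10, "modi wins election"),
    (20, "modi wins big"),
    (100, "late tweet here"),  # falls outside the window below
]

graph = cooccurrence_graph(tweets, window_start=0, window_end=60)
for edge, weight in sorted(graph.items()):
    print(edge, weight)
```

A TextRank-style scorer would then run over this graph, with edge weights playing the role of link strength between lexical units.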
