Abstract

Sina Weibo has become an increasingly critical social media platform in China for sharing the latest news, marketing new products, and discussing controversial issues. Its rising importance in society makes it essential to understand the “what”, “when”, and “who” of hot topics that are continuously tweeted and searched by millions of active users. In this paper, we develop a systematic approach to characterize the temporal distribution of hot topics searched by Sina Weibo users over a four-month time span and to uncover correlated hot topics that are not only tweeted by the same users but also appear in similar sets of tweet messages. We analyze real-time Sina Weibo tweet data streams and study the volume correlations and temporal gaps between user searches and tweeting activities on hot topics. In addition, we examine the correlations between hot topic searches on social media and on search engines to understand hot topics and user behaviors across different platforms. Given the challenges of analyzing massive amounts of tweet data, we explore the Hadoop MapReduce framework to effectively process millions of tweets from the collected data sets, and we quantify the performance benefits of MapReduce for analyzing tweet streams. To the best of our knowledge, this paper is the first effort to characterize temporal search patterns of hot topics on Sina Weibo and to study their correlations with tweeting data streams as well as search engine statistics.
