Abstract

The ability to predict people’s interests in different regions would be valuable to many applications, including marketing and policymaking. We posit that social media plays an important role in capturing collective user interests in different regions and their dynamics over time and across regions. Event mentions in microblogs on social media platforms such as Twitter not only reflect people’s interests in different regions but also affect the posting of future messages, as microblog content propagates to other users through the online social network. In contrast to the various network analysis techniques that have been developed to capture people’s interests and their propagation patterns, we propose an event mention prediction method that utilises an analysis of inter-region relationships. We first obtain regional user interests for each topic by applying Latent Dirichlet Allocation (LDA) to region-specific collections of tweets and then compute pairwise similarities among regions. The resulting similarity-based region network becomes the basis for constructing region groups through the Markov Cluster Algorithm, which helps remove noisy relationships among regions. We then propose a relatively simple regression technique to predict future event mentions in different regions. We demonstrate that the proposed method outperforms the state-of-the-art event prediction method, confirming that the novel method of constructing groups from region-based sub-topic interests indeed contributes to the increase in prediction accuracy.
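
To make the grouping step concrete, the following is a minimal sketch of how a similarity-based region network might be clustered with a basic Markov Cluster (MCL) iteration. It is not the authors' implementation. The abstract does not state the similarity measure, edge threshold, or MCL parameters, so cosine similarity, the 0.5 threshold, and the inflation value of 2.0 are all assumptions. The region-topic matrix `theta` is toy data standing in for LDA output, and the function names are hypothetical.

```python
import numpy as np


def cosine_similarity_matrix(theta):
    """Pairwise cosine similarity between rows of theta (regions x topics)."""
    unit = theta / np.linalg.norm(theta, axis=1, keepdims=True)
    return unit @ unit.T


def markov_cluster(weights, inflation=2.0, max_iter=100, tol=1e-9):
    """Basic MCL: alternate expansion and inflation on a column-stochastic matrix."""
    m = np.array(weights, dtype=float)
    np.fill_diagonal(m, 1.0)               # self-loops, as is usual for MCL
    m /= m.sum(axis=0, keepdims=True)      # make each column sum to 1
    for _ in range(max_iter):
        prev = m
        m = m @ m                          # expansion: two-step random walk
        m = m ** inflation                 # inflation: strengthen strong flows
        m /= m.sum(axis=0, keepdims=True)  # renormalise columns
        if np.abs(m - prev).max() < tol:
            break
    # Each row that still carries flow lists the members of one cluster.
    clusters = set()
    for row in m:
        members = tuple(int(i) for i in np.flatnonzero(row > 1e-6))
        if members:
            clusters.add(members)
    return sorted(clusters)


# Toy region-topic distributions (assumed values, standing in for LDA output).
theta = np.array([
    [0.90, 0.10],   # region 0
    [0.80, 0.20],   # region 1
    [0.10, 0.90],   # region 2
    [0.15, 0.85],   # region 3
])

similarity = cosine_similarity_matrix(theta)
# Assumed noise filter: drop weak inter-region edges before clustering.
network = np.where(similarity >= 0.5, similarity, 0.0)
groups = markov_cluster(network)
print(groups)
```

In this toy example, regions 0 and 1 share one dominant topic and regions 2 and 3 share the other, so the two pairs fall into separate groups. A per-group regression model for future event mentions could then be fitted on top of such groups.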
