Abstract
Tracking how discussion topics evolve in social media, and where these topics are discussed geographically over time, has the potential to provide useful information for many different purposes. In crisis management, knowing a specific topic's current geographical location could provide vital information about where resources should be allocated, or even which ones. This paper describes an attempt to track online discussions geographically over time. A distributed, geo-aware streaming latent Dirichlet allocation model was developed to recognize the locations of topics in unstructured text. To evaluate the model, it was implemented and used for automatic discovery and geographical tracking of election topics during parts of the 2016 American presidential primary elections. The discovered locations correlated with the actual election locations, and the model provided a better geolocation classification than a keyword-based approach.
Highlights
The rapid growth of social media has enabled millions of people to express themselves and reach wide audiences
Topic models, such as latent Dirichlet allocation (LDA), assume that the content of documents is governed by abstract topics which are hidden
A prototypical tool is provided with which the topics and trending locations of streaming media can be automatically discovered
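The topic-model assumption stated in the highlights can be made concrete with a small sketch of the standard LDA generative story: each document draws a hidden topic mixture, and each word is drawn from one hidden topic. This is only an illustration of that textbook assumption, not the distributed geo-aware streaming model described in the paper. The vocabulary, the two topics, and the names `VOCAB`, `TOPIC_WORD_DISTS`, `sample_dirichlet`, and `generate_document` are all hypothetical.

```python
import random

# Hypothetical toy vocabulary and two hand-made topics (assumptions for illustration).
VOCAB = ["vote", "caucus", "storm", "flood"]
TOPIC_WORD_DISTS = [
    [0.5, 0.5, 0.0, 0.0],  # topic 0: election words
    [0.0, 0.0, 0.5, 0.5],  # topic 1: weather words
]

def sample_dirichlet(alphas, rng):
    """Draw from a Dirichlet distribution by normalizing independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(topic_word_dists, vocab, alpha, length, rng):
    """Generate one document under the LDA assumption.

    Returns the hidden topic mixture, the hidden per-word topics,
    and the observed words. Only the words would be visible in practice.
    """
    num_topics = len(topic_word_dists)
    theta = sample_dirichlet([alpha] * num_topics, rng)  # hidden topic mixture
    topics, words = [], []
    for _ in range(length):
        z = rng.choices(range(num_topics), weights=theta)[0]   # hidden topic of this word
        w = rng.choices(vocab, weights=topic_word_dists[z])[0]  # observed word
        topics.append(z)
        words.append(w)
    return theta, topics, words

if __name__ == "__main__":
    rng = random.Random(0)
    theta, topics, words = generate_document(TOPIC_WORD_DISTS, VOCAB, 0.5, 8, rng)
    print("topic mixture:", theta)
    print("words:", words)
```

Fitting a topic model runs this story in reverse: given only the observed words, it estimates the hidden mixtures and topic-word distributions.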
Summary
The rapid growth of social media has enabled millions of people to express themselves and reach wide audiences. Geo-positioning methods have employed a number of indirect indicators, such as the social network, the style of writing (choice of words, etc.), or explicit location references given in the text content [3,4,5,6]. Combining these types of location indicators with metadata such as timezone, language, and even posting time has resulted in fairly good localization of both user residence and tweet post location [7,8]. The results of these efforts usually do not differentiate between the account holder's current and past locations, and they do not focus on the topical correlation.
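As a rough illustration of one indicator mentioned above, explicit location references in the text content, the following sketch matches a post against a small list of place names. It is a minimal example under stated assumptions, not the keyword-based baseline evaluated in the paper: the `GAZETTEER` entries and the function name `find_locations` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical gazetteer: lowercase surface form -> canonical place name.
GAZETTEER = {
    "iowa": "Iowa",
    "new hampshire": "New Hampshire",
    "nevada": "Nevada",
    "south carolina": "South Carolina",
}

def find_locations(text, gazetteer=GAZETTEER):
    """Count explicit, whole-word mentions of gazetteer places in a text."""
    lowered = text.lower()
    counts = Counter()
    for surface, name in gazetteer.items():
        # Word boundaries prevent partial matches such as "nevadan".
        hits = re.findall(r"\b" + re.escape(surface) + r"\b", lowered)
        if hits:
            counts[name] += len(hits)
    return counts

if __name__ == "__main__":
    post = "Heading to Iowa tonight, then New Hampshire. Iowa caucus!"
    print(find_locations(post))
```

Matching of this kind only sees places that are named outright, so a post that discusses a local event without naming the place yields no location.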