Abstract

With the rapid growth of geo-tagged social media data, it has become feasible to explore topics across different areas through text mining and geographical visualization. However, the visual elements of social media data often overlap with each other in the map view, which hinders the perception of semantic features and their geographical distribution. It is therefore important to reduce the visual clutter of large-scale social media data and to enhance the visibility of semantic features across local areas. In this paper, we use a doc2vec model to transform geo-tagged social media data into high-dimensional vectors, so that semantic correlation can be characterized in the dimensionality-reduced space. To reduce the visual clutter of geographical visualization, we propose a dual-objective blue noise sampling model that selects a subset of the social media data while retaining both the semantic correlation and the spatial distribution of the full dataset. A rich set of visual designs, such as a heatmap, a word cloud and a text stream, is implemented to let users evaluate the sampled results from multiple perspectives and explore how semantic features change across areas. The effectiveness and validity of the proposed visualization system are further demonstrated through case studies and expert reviews.

Highlights

  • With 500 million tweets sent worldwide each day, Twitter provides a great deal of information about human life and has gradually become a valuable source of large-scale social media data.

  • Inspired by existing visual abstraction methods, we propose a dual-objective sampling model to simplify the visualization of large-scale geo-tagged social media data, in which both the spatial distribution and the semantic relationships are well retained.

  • A set of complex data modeling operations, such as doc2vec, t-SNE and DBSCAN, is conducted in an offline preprocessing stage.
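
The dual-objective sampling idea in the highlights can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a simple dart-throwing (Poisson-disk style) sampler written under the assumption that each post already has a map position and a 2D semantic position (for example, a doc2vec vector reduced with t-SNE). The function name `dual_objective_sample`, the radii `r_geo` and `r_sem`, and the rejection rule are all hypothetical choices made for illustration.

```python
import math
import random


def dual_objective_sample(points, r_geo, r_sem, seed=0):
    """Greedy dart-throwing sampler over two spaces at once.

    points : list of (x, y, sx, sy) tuples, where (x, y) is the map position
             and (sx, sy) is the position in a 2D semantic embedding.
    r_geo  : minimum spacing on the map (assumed parameter).
    r_sem  : minimum spacing in the semantic embedding (assumed parameter).

    A candidate is rejected only when an already accepted sample is close to
    it in BOTH spaces. Posts that overlap on the map but differ in meaning
    are therefore both kept. This rule is an assumption for illustration.
    """
    rng = random.Random(seed)
    order = list(range(len(points)))
    rng.shuffle(order)  # visit candidates in random order

    accepted = []
    for i in order:
        x, y, sx, sy = points[i]
        conflict = False
        for j in accepted:
            qx, qy, qsx, qsy = points[j]
            near_on_map = math.hypot(x - qx, y - qy) < r_geo
            near_in_meaning = math.hypot(sx - qsx, sy - qsy) < r_sem
            if near_on_map and near_in_meaning:
                conflict = True
                break
        if not conflict:
            accepted.append(i)
    return sorted(accepted)


if __name__ == "__main__":
    posts = [
        (0.00, 0.00, 0.00, 0.0),  # same place, same topic as the next post
        (0.01, 0.00, 0.01, 0.0),
        (0.00, 0.01, 5.00, 5.0),  # same place, different topic
        (3.00, 3.00, 0.00, 0.0),  # different place, same topic
    ]
    print(dual_objective_sample(posts, r_geo=0.5, r_sem=1.0))
```

In this toy input, only one of the first two posts survives, because they are redundant in both space and meaning, while the post with a different topic at the same location is kept. The pairwise check is quadratic in the number of accepted samples; a spatial index would be needed for data at the scale the paper targets.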


Summary

Introduction

With 500 million tweets sent worldwide each day, Twitter provides a great deal of information about human life and has gradually become a valuable source of large-scale social media data. Xu et al. [20] proposed a visual analytics system that allows users to explore spatiotemporal urban topics in large-scale geo-tagged social media data. Displaying massive geo-tagged social media data creates visual clutter and hinders perception: millions of visual elements overlap heavily in the visualization, which makes it difficult to extract human behaviours and their spatial features.
