Semantic-Aware Visual Abstraction of Large-Scale Social Media Data With Geo-Tags

Zhiguang Zhou, Yuhua Liu, Xiaoyun Zhou, Xinlong Zhang

doi:10.1109/access.2019.2935471


Abstract

With the rapid growth of geo-tagged social media data, it has become feasible to explore topics across different areas through text mining and geographical visualization. However, the visual elements of social media data often overlap in the map view, which hampers perception of semantic features and their geographical distribution. It is therefore important to reduce the visual clutter of large-scale social media data and to enhance the visibility of semantic features across local areas. In this paper, we use a doc2vec model to transform geo-tagged social media data into high-dimensional vectors, so that semantic correlation can be characterized in a dimensionality-reduced space. To reduce the visual clutter of geographical visualization, we propose a dual-objective blue noise sampling model that selects a subset of social media data while retaining both the semantic correlation and the spatial distribution of the full dataset. A rich set of visual designs, such as a heatmap, a word cloud and a text stream, enables users to evaluate the sampled results from multiple perspectives and to explore how semantic features change across areas. The effectiveness and validity of the proposed visualization system are further demonstrated through case studies and expert reviews.
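The sampling idea in the abstract can be illustrated with a small dart-throwing sketch. This is not the authors' model: the conflict rule (reject a candidate only when it lies within both a geographic radius and a semantic radius of an already kept sample), the function name `dual_objective_sample`, the tuple layout, and the radii are all assumptions made for illustration. Coordinates are treated as planar, so real longitude/latitude data would need a projection first.

```python
import math
import random


def dual_objective_sample(points, geo_radius, sem_radius, seed=0):
    """Return indices of a subset of `points`.

    Each point is a hypothetical (x, y, sx, sy) tuple: a planar map position
    plus a 2D position in a semantic projection. A candidate is rejected only
    if an already kept sample is close in BOTH spaces (assumed rule).
    """
    rng = random.Random(seed)
    order = list(range(len(points)))
    rng.shuffle(order)  # dart throwing: visit candidates in random order

    kept = []
    for i in order:
        x, y, sx, sy = points[i]
        conflict = False
        for j in kept:
            kx, ky, ksx, ksy = points[j]
            geo_close = math.hypot(x - kx, y - ky) < geo_radius
            sem_close = math.hypot(sx - ksx, sy - ksy) < sem_radius
            if geo_close and sem_close:
                conflict = True
                break
        if not conflict:
            kept.append(i)
    return kept


# Synthetic demo data: 500 random points in the unit square of each space.
rng = random.Random(42)
demo_points = [
    (rng.random(), rng.random(), rng.random(), rng.random())
    for _ in range(500)
]
sample = dual_objective_sample(demo_points, geo_radius=0.1, sem_radius=0.3)
print(f"kept {len(sample)} of {len(demo_points)} points")
```

Under this assumed rule, two posts at nearly the same location can both survive when their semantic positions differ, which is one simple way to keep distinct topics visible in a dense area. The inner loop is quadratic in the number of kept samples; a spatial grid would be needed at the scale the paper targets.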

Highlights

With 500 million tweets sent worldwide each day, Twitter provides a great deal of information about human life and has gradually become a valuable source of large-scale social media data.
Inspired by existing visual abstraction methods, we propose a dual-objective sampling model to simplify the visualization of large-scale geo-tagged social media data, in which both the spatial distribution and the semantic relationships are retained.
A set of complex data modeling operations, such as doc2vec, t-SNE and DBSCAN, is conducted in an offline preprocessing stage.
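The last step of the offline stage named above, DBSCAN, can be sketched in plain Python. The sketch assumes the doc2vec and t-SNE steps have already produced 2D coordinates; the `embedded` points, `eps` and `min_pts` values below are made up for illustration and are not taken from the paper.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE (-1)."""
    labels = [None] * len(points)

    def neighbors(i):
        # Brute-force range query; the point itself counts as a neighbor.
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # core point: keep growing the cluster
        cluster += 1
    return labels


# Hypothetical 2D projection: two tight groups and one isolated point.
embedded = [
    (0.0, 0.0), (0.0, 0.1), (0.1, 0.0), (0.1, 0.1),
    (5.0, 5.0), (5.0, 5.1), (5.1, 5.0), (5.1, 5.1),
    (10.0, 0.0),
]
labels = dbscan(embedded, eps=0.2, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

In practice a library implementation with a spatial index would replace the brute-force neighbor search, which costs quadratic time per pass and would not scale to millions of posts.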

Summary

Introduction

With 500 million tweets sent worldwide each day, Twitter provides a great deal of information about human life and has gradually become a valuable source of large-scale social media data. Xu et al. [20] proposed a visual analytics system that allows users to explore spatiotemporal urban topics in large-scale geo-tagged social media data. Displaying massive geo-tagged social media data can create visual clutter and hinder perception, since millions of visual elements overlap heavily in the visualization, making it difficult to extract human behaviours and their spatial features.
