Capturing and Characterizing Human Activities Using Building Locations in America

Ren, Jiang, Seipel

doi:10.3390/ijgi8050200

Abstract

Capturing and characterizing collective human activities in a geographic space has become much easier than ever before in the big data era. In past decades it was difficult to acquire spatiotemporal information about human beings. Thanks to the boom in mobile devices integrated with positioning systems and in location-based social media data, we can now easily acquire the spatial and temporal information of social media users. Previous studies have successfully used street nodes and geo-tagged social media such as Twitter to predict users’ activities. However, whether human activities can be well represented by social media data remains uncertain. Buildings, on the other hand, are permanent and reliable representations of collective human activities through their historical footprints. This study uses the big data of US building footprints to investigate the reliability of social media users for human activity prediction. We created spatial clusters from 125 million buildings and 1.48 million Twitter points in the US. We further examined and compared the spatial and statistical distributions of the clusters at both country and city levels. The results show that both building and Twitter spatial clusters exhibit a scaling pattern in cluster size, measured by the number of points inside a cluster and by the area of the cluster. More specifically, at the country level, the statistical distribution of the building spatial clusters fits a power law distribution. Inside the four largest cities, the hotspots are power-law-distributed with a power law exponent around 2.0, meaning that they also follow Zipf’s law. The number of buildings and the number of tweets are well correlated, with R² ranging from 0.53 to 0.74. The high correlation and the similarity of the two datasets in terms of spatial and statistical distribution suggest that, although social media users are only a proportion of the entire population, the spatial clusters from geographical big data are a good and accurate representation of overall human activities. This study also indicates that an improved spatial clustering method is more suitable for big data analysis than conventional clustering methods based on Euclidean geometry.
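The two statistics quoted in the abstract, a power law exponent and an R² value, can be illustrated with a minimal Python sketch. It assumes the standard continuous maximum-likelihood estimator for the exponent and the squared Pearson correlation for R²; the abstract does not state which fitting procedure the authors used, so this is not their method. The function names and the synthetic data are hypothetical and serve only as illustration.

```python
import math
import random


def powerlaw_alpha_mle(sizes, xmin):
    """Continuous maximum-likelihood estimate of alpha for p(x) ~ x^(-alpha), x >= xmin.

    Assumption: the tail above xmin is treated as a continuous power law.
    """
    tail = [x for x in sizes if x >= xmin]
    log_sum = sum(math.log(x / xmin) for x in tail)
    if log_sum <= 0:
        raise ValueError("need at least one observation strictly above xmin")
    return 1.0 + len(tail) / log_sum


def r_squared(xs, ys):
    """Squared Pearson correlation, which equals R^2 of a simple linear fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


def sample_powerlaw(alpha, xmin, n, seed=0):
    """Draw n synthetic cluster sizes from a power law by inverse-transform sampling."""
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], so the negative power is always defined.
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


if __name__ == "__main__":
    # Synthetic stand-in for cluster sizes; not the study's data.
    sizes = sample_powerlaw(alpha=2.0, xmin=1.0, n=20000)
    print("estimated alpha:", powerlaw_alpha_mle(sizes, xmin=1.0))
```

An exponent near 2.0 in the size distribution corresponds to a rank-size exponent of 1/(α − 1) = 1, which is the form usually called Zipf’s law. For cluster counts that span several orders of magnitude, `r_squared` would typically be applied to log-transformed building and tweet counts, although that choice is also an assumption here.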

Highlights

Human activities in a geographic space can be characterized by two simple words: when and where.
Distinct from previous studies that have predicted human activities using data gathered over a time period covering the whole study area, the present study focuses on the destinations of people’s daily lives.
The results show that collective human activities can be well captured and characterized using social media data, and that building footprints are reliable representations of human activities.

Summary

Introduction

Human activities in a geographic space can be characterized by two simple words: when and where. Twitter is one of the most popular social media platforms, and millions of tweets are generated daily by more than 140 million active Twitter users [1]. Approximately 2 percent of those tweets contain precise GPS locations [2], and these geo-tagged tweets can be used to infer users’ activities. Unlike so-called small data, which is measured by sampling from the whole population, spatial big data is measured and acquired individually, with very precise geo-locations and time stamps, making it possible to acquire more information through big data analysis.

Objectives

Findings

Discussion

Conclusion