Solving the Fragment Complexity of Official, Social, and Sensorial Urban Data

Hui Liu, Jingqing Jiang, Jie Song, Yaowei Hou, Mohammad Swapan

doi:10.1155/2020/8914757

Open Access

https://doi.org/10.1155/2020/8914757

Abstract

Cities in the big data era hold massive urban data that can be used to create valuable information and digitally enhanced services. Sources of urban data are generally categorized into three types: official, social, and sensorial, which come from the government and enterprises, the social networks of citizens, and sensor networks, respectively. These types typically differ significantly from each other but are consolidated for smart urban services. Given the sophisticated consolidation approaches, we argue that a new challenge, fragment complexity, is ignored in state-of-the-art urban data management. Fragment complexity means that well-integrated data have an appropriate but fragmentary schema and are difficult to query. In contrast to a predefined and rigid schema, a fragmentary schema means that a dataset contains millions of attributes that are nonorthogonally distributed among tables, and the values of these attributes are even more numerous. For a query, locating where these attributes are stored is the first problem encountered, and traditional value-based query optimization does not help. To address this problem, we propose an index on massive attributes as an attribute-oriented optimization, namely, the attribute index. The attribute index is a secondary index for locating the files in which the target attributes are stored. It contains three parts: ATree for searching keys, DTree for locating keys among files, and ADLinks as a mapping table between ATree and DTree. In this paper, the index architecture, logical structure and algorithms, implementation details, creation process, integration into an existing key-value store, and urban application scenario are described. Experiments show that, in comparison with B+-Tree, LSM-Tree, and AVL-Tree, the query time of ATree is 1.1x, 1.5x, and 1.2x faster, respectively. Finally, we integrate our proposal with HBase as UrbanBase, whose query performance is 1.3x faster than that of the original HBase.
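To make the three-part layout concrete, the following is a minimal sketch of an attribute-to-file lookup. It is not the paper's implementation: the abstract does not specify the internal structure of ATree, DTree, or ADLinks, so sorted Python lists stand in for the two trees and a dictionary stands in for the mapping table. The class name `ToyAttributeIndex`, its methods, and the attribute and file names in the usage lines are all hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict


class ToyAttributeIndex:
    """Toy stand-in for an attribute index (assumed design, not the paper's)."""

    def __init__(self):
        self.atree = []                  # sorted attribute names (stands in for ATree)
        self.dtree = []                  # sorted file names (stands in for DTree)
        self.adlinks = defaultdict(set)  # attribute -> file names (stands in for ADLinks)

    @staticmethod
    def _insert_unique(sorted_list, key):
        # Binary-search the position and insert only if the key is absent.
        i = bisect_left(sorted_list, key)
        if i == len(sorted_list) or sorted_list[i] != key:
            sorted_list.insert(i, key)

    def add(self, attribute, file_name):
        """Record that `attribute` is stored in `file_name`."""
        self._insert_unique(self.atree, attribute)
        self._insert_unique(self.dtree, file_name)
        self.adlinks[attribute].add(file_name)

    def locate(self, attribute):
        """Return the sorted list of files that store `attribute`."""
        i = bisect_left(self.atree, attribute)
        if i == len(self.atree) or self.atree[i] != attribute:
            return []  # unknown attribute: no file needs to be read
        return sorted(self.adlinks[attribute])

    def files_for_query(self, attributes):
        """Return the files that store every attribute in `attributes`."""
        result = None
        for attribute in attributes:
            files = set(self.locate(attribute))
            result = files if result is None else result & files
        return sorted(result or [])


# Hypothetical usage: attribute and file names are invented for illustration.
index = ToyAttributeIndex()
index.add("pm25", "sensor_001.dat")
index.add("pm25", "sensor_002.dat")
index.add("temperature", "sensor_001.dat")
index.add("bus_route", "official_001.dat")
candidate_files = index.files_for_query(["pm25", "temperature"])
```

The point of the sketch is only that a query first narrows the set of candidate files by attribute name, before any value-based filtering takes place. A real index over millions of attributes would need disk-resident tree structures rather than in-memory lists.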

Highlights

Urban big data are a large amount of dynamic and static data generated from subjects and objects including various urban facilities, organizations, and individuals
Social urban data refer to the data generated by urban residents in their daily lives, such as social media usage records and global positioning system (GPS) data generated by user activities
For applications of urban big data management, this paper proposes an attribute-value model to represent all types of urban data

Summary

Introduction

Urban big data are a large amount of dynamic and static data generated from subjects and objects including various urban facilities, organizations, and individuals. With the continuous development and maturity of mobile Internet and big data technologies, the types and scale of urban data have increased significantly [1]. Sources of urban data are generally categorized into three types: official, social, and sensorial. Sensorial data refer to the sensor data on urban infrastructure and moving objects, such as historical and real-time data recorded by sensor systems for the environment, water, transportation, gas, and buildings, and pictures and video taken by surveillance cameras. The urban data constituted by these three sources are diverse and large in scale, covering all aspects of urban production and life [5].