Big Data Integration in Remote Sensing across a Distributed Metadata-Based Spatial Infrastructure

Junqing Fan, Lizhe Wang, Yan Ma, Jining Yan

doi:10.3390/rs10010007

Abstract

Since Landsat-1 first started to deliver volumes of pixels in 1972, the volumes of archived data in remote sensing data centers have increased continuously. Because satellite orbit parameters and sensor specifications vary, the storage formats, projections, spatial resolutions, and revisit periods of these archived data differ widely. In addition, the remote sensing data received continuously by each data center arrive at ever higher code rates, so newly received data should be ingested and archived promptly to ensure that users have access to the latest data through retrieval and distribution services. Hence, a sound data integration, organization, and management program is urgently needed. However, the multi-source, massive, heterogeneous, and distributed storage features of remote sensing data have not only made integration across distributed data center spatial infrastructures difficult, but have also left the current modes of data organization and management unable to meet users' requirements for rapid retrieval and access. This paper therefore proposes a remote sensing data integration and management framework, based on object-oriented data technology (OODT) and SolrCloud, that spans a distributed data center spatial infrastructure. In this framework, all of the remote sensing metadata in the distributed sub-centers are transformed into a unified format based on International Organization for Standardization (ISO) 19115, and then ingested and transferred to the main center by OODT components, continuously or at regular intervals. In the main data center, to improve the efficiency of massive data retrieval, we propose a data organization approach based on a logical segmentation indexing (LSI) model, and use SolrCloud to implement distributed indexing and retrieval of massive metadata. Finally, a series of distributed data integration, retrieval, and comparative experiments showed that the proposed distributed data integration and management program is effective. Specifically, the LSI model-based data organization and the SolrCloud-based distributed indexing schema were able to improve the efficiency of massive data retrieval.
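The metadata unification step described in the abstract can be pictured with a minimal Python sketch. It is not the authors' implementation and does not use OODT or real ISO 19115 element names: the two source layouts, the `UnifiedRecord` fields, and every function name are hypothetical, chosen only to show heterogeneous records being mapped to one common shape before ingestion.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class UnifiedRecord:
    """Hypothetical unified metadata record (field names are illustrative)."""
    identifier: str
    platform: str
    acquired: datetime
    west: float
    south: float
    east: float
    north: float


def from_source_a(raw: dict) -> UnifiedRecord:
    # Assumed layout A: upper-left / lower-right corners as (lat, lon)
    # and an ISO 8601 timestamp with an explicit UTC offset.
    (north, west), (south, east) = raw["ul"], raw["lr"]
    return UnifiedRecord(
        identifier=raw["scene_id"],
        platform=raw["satellite"],
        acquired=datetime.fromisoformat(raw["time"]),
        west=west, south=south, east=east, north=north,
    )


def from_source_b(raw: dict) -> UnifiedRecord:
    # Assumed layout B: [west, south, east, north] box and a compact
    # YYYYMMDD date, interpreted here as UTC.
    west, south, east, north = raw["bbox"]
    acquired = datetime.strptime(raw["date"], "%Y%m%d").replace(tzinfo=timezone.utc)
    return UnifiedRecord(
        identifier=raw["id"],
        platform=raw["platform"],
        acquired=acquired,
        west=west, south=south, east=east, north=north,
    )


# One converter per (hypothetical) sub-center format.
CONVERTERS = {"a": from_source_a, "b": from_source_b}


def normalize(raw: dict, source: str) -> UnifiedRecord:
    """Map a source-specific metadata dict to the unified record."""
    return CONVERTERS[source](raw)


rec_a = normalize(
    {"scene_id": "A-001", "satellite": "SAT-A",
     "time": "2017-12-21T03:15:00+00:00",
     "ul": (31.0, 110.0), "lr": (30.0, 111.0)},
    "a",
)
rec_b = normalize(
    {"id": "B-042", "platform": "SAT-B", "date": "20171221",
     "bbox": [110.5, 30.2, 111.5, 31.2]},
    "b",
)
```

Keeping one small converter per source format means that supporting a new sub-center only requires adding one function, while everything downstream (transfer, indexing, retrieval) sees a single record type.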

Highlights

Since Landsat-1 first started to deliver volumes of pixels in 1972, the amount of archived remote sensing data stored by data centers has increased continuously [1,2].
Case 1: The spatial and time query parameters remained unchanged. In this case: (a) when the amount of metadata was less than 7.5 million items, the logical segmentation indexing (LSI) model-based retrieval method took slightly less time than longitude- and latitude-based data retrieval; (b) as the metadata volume increased, the LSI model-based data retrieval was more efficient than the longitude- and latitude-based data retrieval; (c) when the amount of metadata was less than 5.5 million items, the time taken by LSI model-based metadata retrieval on a single Solr node differed little from that on SolrCloud; (d) when the metadata volume increased, the retrieval speed differences between
Case 2: The spatial query parameters remained unchanged but the time frames changed. In this case: (a) as the query time frames increased, the time consumed by SolrCloud and by the single Solr node showed an overall upward trend, although not a pronounced one, which may be a benefit of the inverted index of SolrCloud and Solr; and (b) in the HBase cluster, the query time increased only slightly as the query time frames increased.

Summary

Introduction

Since Landsat-1 first started to deliver volumes of pixels in 1972, the amount of archived remote sensing data stored by data centers has increased continuously [1,2]. The two most widely used data organization models are: (1) spatio-temporal recording system-based satellite orbit stripes or scene organization; and (2) globally meshed grid-based data tiling organization [8]. The former has obvious shortcomings for massive data retrieval and quick access, and the latter increases the amount of data by about one-third because of image segmentation, requiring more storage space. The LSI model takes the logical segmentation indexing code as the identifier of each remote sensing data item, rather than performing an actual physical subdivision. This increases the efficiency of data retrieval with the help of the global subdivision index, and avoids generating the numerous small files that physical subdivision of the data would produce. This paper is organized as follows: Section 2 provides an overview of the background knowledge and related work; Section 3 describes the distributed multi-source remote sensing metadata transformation and integration; Section 4 details the data management methods, including the LSI spatial organization model, full-text index construction, and distributed data retrieval; Section 5 introduces the experiments and provides an analysis of the proposed program; and Section 6 provides a summary and conclusions.
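The idea of indexing by logical grid codes instead of physically tiling the imagery can be illustrated with a small Python sketch. It is an assumption-laden stand-in, not the paper's LSI encoding: it uses a plain fixed-size latitude/longitude grid, a hypothetical `R###C###` code format, and an in-memory dictionary in place of a real index, and it ignores scenes that cross the antimeridian.

```python
import math
from collections import defaultdict


def cell_codes(west, south, east, north, cell_deg=1.0):
    """Return the codes of all grid cells touched by a bounding box.

    Hypothetical code format: R<row>C<col>, counted from (-90, -180).
    """
    col0 = math.floor((west + 180.0) / cell_deg)
    col1 = math.floor((east + 180.0) / cell_deg)
    row0 = math.floor((south + 90.0) / cell_deg)
    row1 = math.floor((north + 90.0) / cell_deg)
    return [
        f"R{row:03d}C{col:03d}"
        for row in range(row0, row1 + 1)
        for col in range(col0, col1 + 1)
    ]


class LogicalGridIndex:
    """Maps grid codes to scene identifiers; the scene files stay whole."""

    def __init__(self, cell_deg=1.0):
        self.cell_deg = cell_deg
        self._cells = defaultdict(set)

    def add(self, scene_id, bbox):
        # Only the metadata is tagged with codes; no image is cut into tiles.
        for code in cell_codes(*bbox, self.cell_deg):
            self._cells[code].add(scene_id)

    def candidates(self, bbox):
        # Scenes sharing at least one cell with the query box. This is a
        # superset of true intersections; an exact check would follow.
        found = set()
        for code in cell_codes(*bbox, self.cell_deg):
            found |= self._cells.get(code, set())
        return found


index = LogicalGridIndex(cell_deg=1.0)
index.add("scene-1", (10.2, 20.1, 11.7, 21.3))    # (west, south, east, north)
index.add("scene-2", (50.0, -10.5, 50.4, -10.1))

hits = index.candidates((11.5, 21.0, 11.9, 21.2))
```

A query then becomes a lookup on a handful of discrete codes rather than a range comparison on four coordinates for every record, which is the kind of access pattern an inverted index handles well.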

Distributed Integration of Remote Sensing Data

Spatial Organization of Remote Sensing Data

OODT: A Data Integration Framework

Distributed Integration of Multi-Source Remote Sensing Data

The ISO 19115-Based Metadata Transformation

Distributed Multi-Source Remote Sensing Data Integration

Spatial Organization and Management of Remote Sensing Data

LSI Organization Model of Multi-Source Remote Sensing Data

Full-Text Index of Multi-Sourced Remote Sensing Metadata

Distributed Data Retrieval

Experiment and Analysis

Distributed Data Integration Experiment

LSI Model-Based Metadata Retrieval Experiment

GeoSOT Grids

Comparative Experiments and Analysis

Comparative Experiments

Results Analysis

Conclusions


Journal: Remote Sensing	Publication Date: Dec 21, 2017
Citations: 30	License type: CC BY 4.0


