Abstract

Various sensors on airborne and satellite platforms are producing large volumes of remote sensing images for mapping, environmental monitoring, disaster management, military intelligence, and other applications. However, efficiently storing, querying and processing such big data is challenging because the workloads are both data- and computing-intensive. In this paper, a Hadoop-based framework is proposed to manage and process big remote sensing data in a distributed and parallel manner. In particular, remote sensing data can be fetched directly from other data platforms into the Hadoop Distributed File System (HDFS). The Orfeo Toolbox (OTB), a ready-to-use tool for large-image processing, is integrated into MapReduce to provide a rich set of image processing operations. With the integration of HDFS, the Orfeo Toolbox and MapReduce, remote sensing images can be processed directly in parallel in a scalable computing environment. The experimental results show that the proposed framework can efficiently manage and process big remote sensing data.

Highlights

  • Big Data, referring to the enormous volume, velocity, and variety of data (NIST Cloud/BigData Workshop, 2014), has become one of the biggest technology shifts in the 21st century (Mayer-Schönberger and Cukier, 2013)

  • RS image processing first reads the data into memory for analysis, so data I/O has become the bottleneck when using high-performance computing (HPC) to process RS images

  • Hadoop is an open-source software framework for distributed storage and distributed processing of very large data sets on computer clusters built from commodity hardware. It is composed of Hadoop Common, Hadoop Distributed File System, Hadoop YARN and Hadoop MapReduce


Summary

1. INTRODUCTION

Big Data, referring to the enormous volume, velocity, and variety of data (NIST Cloud/BigData Workshop, 2014), has become one of the biggest technology shifts in the 21st century (Mayer-Schönberger and Cukier, 2013). Hadoop is an open-source software framework for the distributed storage and distributed processing of very large data sets on computer clusters built from commodity hardware. It is composed of Hadoop Common, the Hadoop Distributed File System, Hadoop YARN and Hadoop MapReduce. To address the challenges posed by processing big RS data, this paper proposes a Hadoop-based distributed framework to efficiently manage and process big RS image data. This framework distributes RS images among the nodes of a cluster. By integrating the functions of the OTB libraries into MapReduce, these RS images can be processed directly in parallel.
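As a minimal sketch of the idea (not the authors' implementation), the map phase of such a framework can be thought of as applying an image operation independently to each stored tile. The Python stand-in below uses `multiprocessing` in place of Hadoop MapReduce and a simple per-pixel threshold in place of an OTB operator; all names (`split_into_tiles`, `process_tile`, `process_image`) are hypothetical:

```python
from multiprocessing import Pool

def split_into_tiles(image, tile_size):
    """Partition a flat list of pixel values into fixed-size tiles,
    analogous to an image being split into HDFS blocks."""
    return [image[i:i + tile_size] for i in range(0, len(image), tile_size)]

def process_tile(tile):
    """Map task: apply a per-pixel operation to one tile.
    A simple threshold stands in for an OTB image operator."""
    return [1 if px > 128 else 0 for px in tile]

def process_image(image, tile_size=4, workers=2):
    """Run the map phase over all tiles in parallel, then
    concatenate the per-tile results (a trivial reduce step)."""
    tiles = split_into_tiles(image, tile_size)
    with Pool(workers) as pool:
        results = pool.map(process_tile, tiles)
    return [px for tile in results for px in tile]

if __name__ == "__main__":
    image = [0, 200, 130, 50, 255, 10, 129, 128]
    print(process_image(image))
```

The key property this toy example shares with the proposed framework is that each tile is processed without reference to any other tile, which is what lets HDFS-resident image blocks be handed to map tasks on whichever cluster node stores them.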

2. RELATED WORKS
Data Management
Data Partition Period
Map Period
Cluster Environment
Experiment Results
5. CONCLUSION & DISCUSSION