Hypergraph+: An Improved Hypergraph-Based Task-Scheduling Algorithm for Massive Spatial Data Processing on Master-Slave Platforms

Bo Cheng, Xuefeng Guan, Rui Li, Huayi Wu

doi:10.3390/ijgi5080141

Abstract

Spatial data processing often requires massive datasets, and the efficiency of task/data scheduling in these applications affects overall processing performance. Among existing scheduling strategies, hypergraph-based algorithms capture the data sharing pattern globally and significantly reduce total communication volume. On heterogeneous processing platforms, however, a single hypergraph partitioning used for later scheduling may not be optimal. Moreover, these scheduling algorithms neglect the overlap between task execution and data transfer, which could further decrease execution time. To address these problems, an extended hypergraph-based task-scheduling algorithm, named Hypergraph+, is proposed for massive spatial data processing. Hypergraph+ improves upon current hypergraph scheduling algorithms in two ways: (1) it takes platform heterogeneity into account by offering a metric function that evaluates partitioning quality in order to derive the best task/file schedule; and (2) it can maximize the overlap between communication and computation. The GridSim toolkit was used to evaluate Hypergraph+ in an inverse distance weighted (IDW) spatial interpolation application on heterogeneous master-slave platforms. Experiments show that the proposed Hypergraph+ algorithm achieves, on average, a 43% smaller makespan than the original hypergraph scheduling algorithm while still preserving high scheduling efficiency.

Highlights

In recent years, with the rapid development of surveying and remote sensing technologies, the volume of spatial data has increased dramatically [1,2,3].
We propose an extended hypergraph-based task-scheduling algorithm, named Hypergraph+.
Since the task execution time can be defined in terms of million instructions (MI), the CPU resource speed was modeled in million instructions per second (MIPS).
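
The MI/MIPS convention in the last highlight implies a simple relation: a task's execution time in seconds is its length in MI divided by the speed in MIPS of the CPU it runs on. The following is a minimal sketch of that arithmetic only. The function name and the numbers are hypothetical and are not taken from the paper or from GridSim.

```python
def execution_time_seconds(task_length_mi: float, cpu_speed_mips: float) -> float:
    """Seconds needed to run a task of `task_length_mi` million instructions
    on a CPU rated at `cpu_speed_mips` million instructions per second."""
    if cpu_speed_mips <= 0:
        raise ValueError("CPU speed must be positive")
    return task_length_mi / cpu_speed_mips


# Hypothetical task length and CPU ratings, chosen only for illustration.
task_mi = 12_000.0
slow_node_time = execution_time_seconds(task_mi, 500.0)    # slower node
fast_node_time = execution_time_seconds(task_mi, 2_000.0)  # faster node
print(slow_node_time, fast_node_time)
```

On a heterogeneous platform the same task therefore takes a different time on each node, which is why node speed matters when a schedule is evaluated.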

Summary

Introduction

With the rapid development of surveying and remote sensing technologies, the volume of spatial data has increased dramatically [1,2,3]. Spatial data processing is a typical data-intensive application in which users must access and process massive spatial data. Each task requires a subset of the input files held on the storage nodes; a task may share some of its files with other tasks, and each task is submitted to a single computing node for execution. The computing nodes are connected to the storage nodes through a network for data transfer. This collaboration is orchestrated by a task/data scheduling strategy, and the efficiency of that strategy has an important influence on the performance of the collaboration.
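
The file-sharing model described above can be illustrated with a small sketch showing why the placement of tasks affects how much data must be moved. It assumes, for illustration only, that a computing node fetches each file it needs once and reuses it for every task assigned to it. The task names, file names, sizes, and both helper functions are hypothetical and are not the paper's algorithm.

```python
from collections import defaultdict


def files_per_node(task_files, assignment):
    """Collect the set of distinct files each node needs for its assigned tasks."""
    needed = defaultdict(set)
    for task, files in task_files.items():
        needed[assignment[task]] |= set(files)
    return dict(needed)


def total_transfer_volume(task_files, assignment, file_sizes):
    """Sum the sizes of all files sent to all nodes, assuming each node
    fetches a given file at most once (an assumption of this sketch)."""
    per_node = files_per_node(task_files, assignment)
    return sum(file_sizes[f] for files in per_node.values() for f in files)


# Hypothetical workload: three tasks whose input files partly overlap.
task_files = {"t1": {"a", "b"}, "t2": {"b", "c"}, "t3": {"c", "d"}}
file_sizes = {"a": 10, "b": 20, "c": 30, "d": 40}

# Two candidate assignments of tasks to computing nodes.
grouped = {"t1": "n1", "t2": "n1", "t3": "n2"}    # t1 and t2 share file b on n1
scattered = {"t1": "n1", "t2": "n2", "t3": "n1"}  # sharing tasks are split up

print(total_transfer_volume(task_files, grouped, file_sizes))
print(total_transfer_volume(task_files, scattered, file_sizes))
```

Placing tasks that share files on the same node lowers the total volume in this toy case, which is the kind of sharing pattern a scheduling strategy can exploit.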

Journal: ISPRS International Journal of Geo-Information
Publication Date: Aug 10, 2016
Citations: 7
License type: CC BY 4.0
