Apache Spark Accelerated Deep Learning Inference for Large Scale Satellite Image Analytics

Dalton Lunga, Jonathan Gerrand, Lexie Yang, Robert Stewart, Christopher Layton

doi:10.1109/jstars.2019.2959707


Abstract

The sheer volume of data generated by earth observation and remote sensing technologies continues to make a major impact, propelling key geospatial applications into an era that is both data-intensive and compute-intensive. This rapid advancement poses new computational and data processing challenges. We implement a novel remote sensing data flow (RESFlow) for advancing machine learning to compute with massive amounts of remotely sensed imagery. The core contribution is partitioning massive amounts of data into homogeneous distributions for fitting simple models. RESFlow takes advantage of Apache Spark and modern computing hardware to accelerate deep learning inference on expansive remote sensing imagery. The framework incorporates a strategy to optimize resource utilization across multiple executors assigned to a single worker. We showcase its deployment in both compute-intensive and data-intensive workloads for pixel-level labeling tasks. The pipeline invokes deep learning inference at three stages: deep feature extraction, deep metric mapping, and deep semantic segmentation. These tasks impose compute-intensive and GPU resource-sharing challenges, motivating a parallelized pipeline for all execution steps. To address hardware resource contention, our containerized workflow further incorporates a novel GPU checkout routine and a ticketing system across multiple workers. The workflow is demonstrated on NVIDIA DGX accelerated platforms and offers appreciable compute speed-ups for deep learning inference on pixel labeling workloads, processing 21 028 TB of imagery data and delivering output maps at an area rate of 5.245 sq.km/s, amounting to 453 168 sq.km/day and reducing a 28 day workload to 21 h.
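The abstract does not describe how the GPU checkout routine or the ticketing system is implemented, so the following is only a minimal sketch of the general idea: a fixed set of GPU "tickets" that tasks must check out before running inference and return afterwards. It is not the RESFlow implementation. The names `GpuTicketPool`, `checkout`, and `run_tasks` are hypothetical, and the sketch uses threads inside one process, whereas the paper coordinates multiple Spark workers in containers.

```python
import queue
import threading
from contextlib import contextmanager


class GpuTicketPool:
    """Hypothetical ticket pool: each ticket is a GPU id held by one task at a time."""

    def __init__(self, gpu_ids):
        self._tickets = queue.Queue()
        for gpu_id in gpu_ids:
            self._tickets.put(gpu_id)

    @contextmanager
    def checkout(self, timeout=None):
        # Block until some GPU ticket is free, then hand it to the caller.
        gpu_id = self._tickets.get(timeout=timeout)
        try:
            yield gpu_id
        finally:
            # Always return the ticket, even if the task raised an error.
            self._tickets.put(gpu_id)

    def available(self):
        return self._tickets.qsize()


def run_tasks(pool, n_tasks):
    """Run n_tasks placeholder jobs; return (log, violations)."""
    lock = threading.Lock()
    in_use = set()
    log = []
    violations = []

    def task(task_id):
        with pool.checkout() as gpu_id:
            with lock:
                if gpu_id in in_use:
                    # Two tasks holding the same GPU would be a contention bug.
                    violations.append((task_id, gpu_id))
                in_use.add(gpu_id)
            # Placeholder for real work, e.g. loading a model on this GPU
            # and labeling one image tile.
            with lock:
                in_use.discard(gpu_id)
                log.append((task_id, gpu_id))

    threads = [threading.Thread(target=task, args=(i,)) for i in range(n_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log, violations


if __name__ == "__main__":
    pool = GpuTicketPool(gpu_ids=[0, 1])  # assumed: two GPUs on the node
    log, violations = run_tasks(pool, n_tasks=8)
    print(len(log), "tasks finished;", len(violations), "contention violations")
```

A queue-backed pool keeps the sketch short because blocking on an empty queue gives waiting tasks the "take a ticket and wait" behavior for free. A multi-worker deployment would need a shared store (for example a lock file or a coordination service) in place of the in-process queue.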

Highlights

Earth observation and remote sensing are both fields that have undergone a renaissance recently, making major impacts in key geospatial applications including land cover mapping, infrastructure mapping, damage assessment, and population distribution studies [1]–[4].
2) We take advantage of Apache Spark to provide, for a single large image scene, fast parallel inference functionality wherein an area pixel labeling rate of 5.245 sq.km/s, amounting to 453 168 sq.km/day, is achieved, reducing a 28 day workload to 21 h.
3) We present a containerized workflow for Apache Spark operations coordinated with GPUs, following deep learning inference best practices (e.g., efficient GPU usage and ticketing across multiple workers) for large deep learning workloads deployed on GPU clusters.
Remote sensing data flow (RESFlow) is seen to perform very comparably to the Mono model for two of the three test regions. This is notable, as each model from the RESFlow Image Gallery sees considerably less data during training than its Mono model counterpart, and yet is able to generalize to a similar degree.
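The throughput figures quoted in the highlights can be cross-checked with simple arithmetic. The short sketch below uses only the numbers stated in the text (5.245 sq.km/s, a 28 day baseline, and a 21 h accelerated run); the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

area_rate_sq_km_per_s = 5.245   # labeling rate reported in the text
baseline_days = 28              # workload duration before acceleration
accelerated_hours = 21          # workload duration with RESFlow

# Daily mapped area implied by the per-second rate.
daily_area_sq_km = area_rate_sq_km_per_s * SECONDS_PER_DAY

# Speed-up implied by the two reported durations.
speedup = (baseline_days * 24) / accelerated_hours

if __name__ == "__main__":
    print(f"Daily area: {daily_area_sq_km:.0f} sq.km/day")
    print(f"Speed-up: {speedup:.0f}x")
```

The per-second rate multiplied by 86 400 s reproduces the reported 453 168 sq.km/day, and 672 h divided by 21 h corresponds to a 32-fold reduction in wall-clock time.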

Summary

Introduction

Earth observation and remote sensing are both fields that have undergone a renaissance recently, making major impacts in key geospatial applications including land cover mapping, infrastructure mapping, damage assessment, and population distribution studies [1]–[4]. Remote sensing applications have leaped into a data-intensive and compute-intensive era, presenting challenges and opportunities for new advanced machine learning and computer vision workflows. Examples of such applications include supporting accurate population distribution estimates, studying sustainability outcomes at scale [5], and identifying urban environments over large contexts using abundant satellite imagery and breakthroughs in deep learning based image classification [6].



Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Publication Date: Jan 1, 2020
Citations: 37
License type: CC BY 4.0
