Abstract

Anomaly detection aims to separate anomalous pixels from the background, and has become an important application of remotely sensed hyperspectral image processing. Anomaly detection methods based on low-rank and sparse representation (LRASR) can accurately detect anomalous pixels. However, as hyperspectral image repositories grow in volume, such techniques become very time-consuming, mainly because of the large number of matrix computations involved. In this paper, we propose a novel distributed parallel algorithm (DPA) that redesigns the key operators of LRASR in terms of the MapReduce model to accelerate LRASR on cloud computing architectures. Independent computation operators are identified and executed in parallel on Spark. Specifically, we reconstitute the hyperspectral images in a format suited to efficient DPA processing, design an optimized storage strategy, and develop a pre-merge mechanism to reduce data transmission. A repartitioning policy is also proposed to improve DPA’s efficiency. Our experimental results demonstrate that the newly developed DPA achieves very high speedups when accelerating LRASR while maintaining similar accuracy. Moreover, the proposed DPA is shown to scale with the number of computing nodes and to be capable of processing big hyperspectral images involving massive amounts of data.

Highlights

  • In recent years, hyperspectral remote sensing has been widely used in various fields of Earth observation and space exploration [1,2,3,4,5,6]

  • Band interleaved by line (BIL), band interleaved by pixel (BIP) and band sequential (BSQ) are three common formats for arranging hyperspectral image (HSI) data

  • In the first two experiments, HSI1 and HSI2 are processed by low-rank and sparse representation (LRASR) and our proposed distributed parallel algorithm (DPA) on Spark1, respectively
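The three interleave formats named above differ only in the order in which samples are laid out in a flat buffer. The following is a minimal sketch of that ordering in plain Python, not code from the paper; the function names and the toy cube are hypothetical, and the sketch does not say which format the DPA actually adopts.

```python
def flat_index(layout, row, col, band, rows, cols, bands):
    """Offset of one sample in a flat buffer for the given interleave."""
    if layout == "BSQ":  # band sequential: all of band 0, then all of band 1, ...
        return band * rows * cols + row * cols + col
    if layout == "BIL":  # by line: for each row, band 0 of that row, then band 1, ...
        return row * bands * cols + band * cols + col
    if layout == "BIP":  # by pixel: all bands of one pixel are adjacent
        return row * cols * bands + col * bands + band
    raise ValueError(f"unknown layout: {layout}")


def write_flat(cube, layout):
    """Serialize cube[row][col][band] into a flat list in the given layout."""
    rows, cols, bands = len(cube), len(cube[0]), len(cube[0][0])
    flat = [None] * (rows * cols * bands)
    for r in range(rows):
        for c in range(cols):
            for b in range(bands):
                flat[flat_index(layout, r, c, b, rows, cols, bands)] = cube[r][c][b]
    return flat


def pixel_spectrum(flat, layout, row, col, rows, cols, bands):
    """Recover the full spectrum of one pixel from a flat buffer."""
    return [flat[flat_index(layout, row, col, b, rows, cols, bands)]
            for b in range(bands)]


# Hypothetical toy cube: 2 rows, 3 columns, 4 bands; each value encodes its position.
rows, cols, bands = 2, 3, 4
cube = [[[100 * r + 10 * c + b for b in range(bands)]
         for c in range(cols)] for r in range(rows)]

# The same pixel spectrum is recovered from every layout.
spectra = {layout: pixel_spectrum(write_flat(cube, layout), layout, 1, 2,
                                  rows, cols, bands)
           for layout in ("BSQ", "BIL", "BIP")}
```

In BIP the spectrum of a pixel occupies consecutive positions, whereas in BSQ its samples are `rows * cols` positions apart, which is one reason the choice of format matters for pixel-wise processing.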


Summary

Introduction

Hyperspectral remote sensing has been widely used in various fields of Earth observation and space exploration. After reading data from disk to obtain an HSI X, LRASR first employs the K-means algorithm to cluster all pixels, and then uses a dictionary construction method to obtain the background dictionary matrix D from the resulting K clusters. Our newly developed DPA first identifies the independent computation operators in LRASR and processes them in parallel on multiple nodes provided by Spark. We establish a distributed parallel dictionary construction method, in which a new repartition policy is developed so that K nodes execute the processing operators of the K clusters in parallel. Before detailing the three distributed parallel methods, we first describe the data organization and storage optimization methods, which help the proposed DPA reduce data transmission and improve efficiency.
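The idea of running K-means as independent per-partition operators with a pre-merge step can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the paper's implementation: the function names, the toy data, and the exact form of the pre-merge (one partial sum and count per cluster per partition) are assumptions. On Spark, the per-partition step would roughly correspond to an operation such as `mapPartitions` and the merge to `reduceByKey`.

```python
def nearest(pixel, centers):
    """Index of the center closest to pixel (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda k: sum((p - c) ** 2 for p, c in zip(pixel, centers[k])))


def map_partition(partition, centers):
    """Map plus pre-merge: emit one (sum vector, count) per cluster per partition."""
    partial = {}
    for pixel in partition:
        k = nearest(pixel, centers)
        if k in partial:
            s, n = partial[k]
            partial[k] = ([a + b for a, b in zip(s, pixel)], n + 1)
        else:
            partial[k] = (list(pixel), 1)
    return partial


def reduce_partials(partials, centers):
    """Merge the partial sums from all partitions and compute the new centers."""
    totals = {}
    for partial in partials:
        for k, (s, n) in partial.items():
            if k in totals:
                ts, tn = totals[k]
                totals[k] = ([a + b for a, b in zip(ts, s)], tn + n)
            else:
                totals[k] = (list(s), n)
    # A cluster that received no pixels keeps its previous center.
    return [[v / totals[k][1] for v in totals[k][0]] if k in totals
            else list(centers[k])
            for k in range(len(centers))]


def kmeans_step(partitions, centers):
    """One K-means iteration; each partition could be handled by a separate node."""
    return reduce_partials([map_partition(p, centers) for p in partitions], centers)


# Hypothetical toy data: three partitions of 2-band pixels and K = 2 centers.
partitions = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 10.0), (10.0, 12.0)], [(1.0, 1.0)]]
centers = [(0.0, 0.0), (10.0, 10.0)]
new_centers = kmeans_step(partitions, centers)
```

With the pre-merge, each partition sends at most K (sum, count) pairs to the reducer instead of one record per pixel, which is the kind of reduction in data transmission the summary describes.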

Data Organization and Storage Optimization Methods
Distributed Parallel K-Means Algorithm
Distributed Parallel Dictionary Construction
Distributed Parallel ADMM
Comparison and Analysis
Experimental Results
Experiments
Conclusions and Future Work
Background
