Abstract

Hyperspectral remote sensing has tremendous potential for monitoring land cover and water bodies because of the rich spatial and spectral information contained in the images. Obtaining groundtruth data for these images by field sampling is a time- and resource-consuming task. A semi-supervised method for labeling and classification of hyperspectral images is presented. The unsupervised stage consists of image enhancement by feature extraction, followed by clustering for labeling and generating the groundtruth image. The supervised stage for classification consists of a preprocessing stage involving normalization, computation of principal components, and feature extraction. An ensemble of machine learning models takes the extracted features and the groundtruth data from the unsupervised stage as input, and a decision block then combines the outputs of the models to label the image by majority voting. The ensemble includes support vector machines, gradient boosting, a Gaussian classifier, and a linear perceptron. Overall, the gradient boosting method gives the best performance for supervised classification of hyperspectral images. The presented ensemble method is useful for generating labeled data for hyperspectral images that do not have groundtruth information. It gives an overall accuracy of 93.74% for the Jasper hyperspectral image, 100% accuracy for the HSI2 Lake Erie images, and 99.92% for the classification of cyanobacteria or harmful algal blooms and surface scum. The method distinguishes well between blue-green algae and surface scum. The full pipeline ensemble method for classifying Lake Erie images runs 24 times faster on a cloud server than on a workstation.
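The decision block described in the abstract combines the per-model labels by majority voting. A minimal sketch of that step follows, assuming each model returns one label per pixel. The function name `majority_vote`, the example class names, and the tie-breaking rule (the earliest-listed model wins) are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label sequences into one label per pixel.

    predictions: list of equal-length label sequences, one per model.
    Tie-breaking is an assumption: among the labels that share the top
    count, the one predicted by the earliest-listed model is kept.
    """
    n_pixels = len(predictions[0])
    if any(len(p) != n_pixels for p in predictions):
        raise ValueError("all models must label the same number of pixels")

    combined = []
    for votes in zip(*predictions):  # one tuple of votes per pixel
        counts = Counter(votes)
        best = max(counts.values())
        # First label, in model order, that reaches the top count.
        combined.append(next(v for v in votes if counts[v] == best))
    return combined


# Hypothetical outputs of the four models for four pixels.
svm = ["water", "tree", "soil", "road"]
gb = ["water", "tree", "tree", "road"]
gc = ["water", "soil", "tree", "soil"]
lp = ["tree", "tree", "soil", "soil"]

labels = majority_vote([svm, gb, gc, lp])
print(labels)
```

With four voters, two-against-two ties are possible, so a real implementation needs some explicit tie rule; the abstract does not say which one the authors use.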

Highlights

  • Hyperspectral imaging (HSI) provides a high density of spectral information in the hundreds of bands of the imaged material

  • Input: dataset train, label for dataset train, tolerance, kernel, depth, estimators
    Output: Models
    Begin:
      Initialize variables for accuracy, F1 score, confusion matrix for the models
      For 10-fold cross validation of the data
        compute Support Vector Machines (SVM) Model using data, label, and tolerance
        compute Gradient Boost Classifier (GB) Model using data, label, estimators, and depth
        compute Linear Perceptron (LP) Model using data, label, and tolerance
        compute GC Model using data, label, and kernel
        compute accuracy score for the four models
        compute F1 score for the four models
        compute confusion matrix score for the four models
        save (SVM Model, GB Model, LP Model, GC Model)
        append accuracy, F1-score, confusion matrix
      Return Models, metrics

  • The unsupervised machine learning block proposed is composed of four machine learning methods: SVM, GB, GC, LP

  • Image 1 is of size 3270 × 960, where 3270 is the number of lines, and 960 is the number of samples per line. Image 2 in Figure 7b is of size 4444 × 960, where 4444 is the number of lines, and 960 is the number of samples per line
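The model-training procedure listed in the highlights (four models, 10-fold cross validation, accuracy, F1 score, and confusion matrix per model) could be sketched with scikit-learn roughly as follows. This is a sketch under stated assumptions, not the authors' implementation: the function name `train_models`, the synthetic data, the default parameter values, the macro-averaged F1, and the reading of "GC" as a Gaussian process classifier are all assumptions. The highlights also do not say which fold's models are saved, so the sketch keeps the ones from the last fold.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF
from sklearn.linear_model import Perceptron
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC


def train_models(X, y, tolerance=1e-3, kernel=None, depth=3, estimators=20):
    """Fit SVM, GB, LP, and GC models under 10-fold cross validation."""
    # One factory per model so every fold starts from a fresh estimator.
    factories = {
        "SVM": lambda: SVC(tol=tolerance),
        "GB": lambda: GradientBoostingClassifier(
            n_estimators=estimators, max_depth=depth, random_state=0
        ),
        "LP": lambda: Perceptron(tol=tolerance, random_state=0),
        # Assumption: "GC" is taken to be a Gaussian process classifier.
        "GC": lambda: GaussianProcessClassifier(kernel=kernel),
    }
    classes = np.unique(y)
    models = {}
    metrics = {n: {"accuracy": [], "f1": [], "confusion": []} for n in factories}

    folds = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    for train_idx, test_idx in folds.split(X, y):
        for name, make in factories.items():
            model = make().fit(X[train_idx], y[train_idx])
            pred = model.predict(X[test_idx])
            truth = y[test_idx]
            metrics[name]["accuracy"].append(accuracy_score(truth, pred))
            metrics[name]["f1"].append(f1_score(truth, pred, average="macro"))
            metrics[name]["confusion"].append(
                confusion_matrix(truth, pred, labels=classes)
            )
            models[name] = model  # assumption: keep the last fold's model
    return models, metrics


# Synthetic stand-in for extracted pixel features and cluster-derived labels.
X, y = make_classification(
    n_samples=200, n_features=8, n_informative=5, n_classes=3, random_state=0
)
models, metrics = train_models(X, y, kernel=RBF(1.0))
for name, m in metrics.items():
    print(name, round(float(np.mean(m["accuracy"])), 3))
```

For a real image, `X` would hold one row per pixel (for example 3270 × 960 rows for Image 1) and one column per extracted feature, with `y` taken from the groundtruth image produced by the clustering stage.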


Summary

Introduction

Hyperspectral imaging (HSI) provides a high density of spectral information in the hundreds of bands of the imaged material. Most modern hyperspectral sensors have a high spatial resolution, enabling the images to have a range of applications in agriculture, ecosystem monitoring, astronomy, molecular biology, biomedical imaging, geosciences, physics, and surveillance. There are linear and nonlinear methods for hyperspectral unmixing [1]. They can be used to gain preliminary knowledge on the site before embarking on a field campaign. These images are useful for informed decision-making on a terrestrial or aquatic ecosystem.

