Abstract

Using ground-based, remote hyperspectral images from 0.4–1.0 micron in ∼850 spectral channels—acquired with the Urban Observatory facility in New York City—we evaluate the use of one-dimensional Convolutional Neural Networks (CNNs) for pixel-level classification and segmentation of built and natural materials in urban environments. We find that a multi-class model trained on hand-labeled pixels containing Sky, Clouds, Vegetation, Water, Building facades, Windows, Roads, Cars, and Metal structures yields an accuracy of 90–97% for three different scenes. We assess the transferability of this model by training on one scene and testing on another with significantly different illumination conditions and/or different content. This results in a significant (∼45%) decrease in model precision and recall, as does training on all scenes at once and testing on the individual scenes. These results suggest that while CNNs are powerful tools for pixel-level classification of very high-resolution spectral data of urban environments, retraining between scenes may be necessary. Furthermore, we test the dependence of the model on several instrument- and data-specific parameters, including reduced spectral resolution (down to 15 spectral channels) and the number of available training instances. The results are strongly class-dependent; however, we find that the classification of natural materials is particularly robust, especially the Vegetation class, with a precision and recall >94% for all scenes and model transfers and >90% with only a single training instance.

Highlights

  • As of 2018, more than 55% of the world’s population live in urban areas with a projection of up to 68% living in cities by 2050 [1]

  • In the results described below, we trained our Convolutional Neural Network (CNN) models on the hand-labeled pixels from Figure 3 to address the six goals described in Section 1, which are designed to determine the models’ utility in segmenting and classifying pixels in urban hyperspectral imaging

  • All metrics obtained from the test instances in Scene 1-a indicate that for the chosen CNN architecture, models with filters consisting of 50 spectral channels (∼35 nm wide) yield optimal performance with a mean F1 score of 0.97 and testing accuracy of 94.2%
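The paper's full architecture is not reproduced on this page, but the core operation the highlight above refers to—sliding a bank of learned filters, each spanning 50 spectral channels (∼35 nm), across a single pixel's ∼850-channel spectrum—can be sketched in a few lines of NumPy. The filter count, random weights, and function name below are illustrative assumptions, not the authors' configuration:

```python
import numpy as np

def conv1d_bank(spectrum, filters, stride=1):
    """Valid-mode 1D convolution of one pixel spectrum with a bank of filters.

    spectrum : (n_channels,) array of reflectance/intensity values
    filters  : (n_filters, kernel_width) array of filter weights
    Returns a (n_filters, n_windows) feature map.
    """
    k = filters.shape[1]
    n_out = (spectrum.size - k) // stride + 1
    out = np.empty((filters.shape[0], n_out))
    for i in range(n_out):
        window = spectrum[i * stride : i * stride + k]
        out[:, i] = filters @ window  # each filter's response at this position
    return out

rng = np.random.default_rng(0)
spectrum = rng.normal(size=850)     # one pixel, ~850 spectral channels (0.4–1.0 micron)
filters = rng.normal(size=(8, 50))  # hypothetical bank of 8 filters, 50 channels (~35 nm) wide

features = conv1d_bank(spectrum, filters)
print(features.shape)  # (8, 801): 850 - 50 + 1 valid positions per filter
```

In a trained 1D CNN these feature maps would be passed through a nonlinearity, pooled, and fed to a classifier head; the sketch only isolates the spectral filtering step whose kernel width the highlight identifies as the key tuning parameter.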


Introduction

As of 2018, more than 55% of the world’s population live in urban areas, with a projection of up to 68% living in cities by 2050 [1]. Albert et al. (2017) [12] used large-scale satellite RGB imagery from Google Maps’ static API of 10 European cities, classified by the open-source Urban Atlas survey into 10 land use classes, to analyze patterns in land use in urban neighborhoods. Their results showed that some types of urban environments are easier to infer than others, especially for classes that tend to be visually similar, such as agricultural lands and airports.
