Abstract

With the advent of high-spatial resolution (HSR) satellite imagery, urban land use/land cover (LULC) mapping has become one of the most popular applications in remote sensing. Due to the importance of context information (e.g., size, shape, and texture) for classifying urban LULC features, Geographic Object-Based Image Analysis (GEOBIA) techniques are commonly employed for mapping urban areas. Regardless of adopting a pixel- or object-based framework, the selection of a suitable classifier is of critical importance for urban mapping. The popularity of deep learning (DL) (or deep neural networks (DNNs)) for image classification has recently skyrocketed, but it remains arguable whether, or to what extent, DL methods can outperform other state-of-the-art ensemble and/or Support Vector Machine (SVM) algorithms in the context of urban LULC classification using GEOBIA. In this study, we carried out an experimental comparison among different architectures of DNNs (i.e., regular deep multilayer perceptron (MLP), regular autoencoder (RAE), sparse autoencoder (SAE), variational autoencoder (VAE), and convolutional neural networks (CNNs)), common ensemble algorithms (Random Forests (RF), Bagging Trees (BT), Gradient Boosting Trees (GB), and Extreme Gradient Boosting (XGB)), and SVM to investigate their potential for urban mapping using a GEOBIA approach. We tested the classifiers on two remote sensing (RS) images (with spatial resolutions of 30 cm and 50 cm). Based on our experiments, we drew the following main conclusions: First, we found that the MLP model was the most accurate classifier. Second, unsupervised pretraining with the use of autoencoders led to no improvement in the classification result. Third, the small difference between the classification accuracies of MLP and those of other models such as the SVM, GB, and XGB classifiers demonstrated that other state-of-the-art machine learning classifiers are still versatile enough to handle the mapping of complex landscapes. Finally, the experiments showed that the integration of CNN and GEOBIA did not lead to more accurate results than the other classifiers applied.
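The experimental comparison described above can be sketched with scikit-learn. This is a minimal illustration only: the feature table, class count, and every hyperparameter below are placeholders, not the study's actual configuration, and XGB/CNN variants are omitted for brevity.

```python
# Sketch: comparing several of the classifier families named in the abstract
# on a synthetic stand-in for a per-object GEOBIA feature table.
# All data and hyperparameters here are illustrative, not the study's setup.
from sklearn.datasets import make_classification
from sklearn.ensemble import (BaggingClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Synthetic "object features" (e.g., per-segment spectral/shape/texture
# statistics) with four hypothetical LULC classes.
X, y = make_classification(n_samples=500, n_features=30, n_informative=15,
                           n_classes=4, random_state=0)

classifiers = {
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "BT": BaggingClassifier(n_estimators=50, random_state=0),
    "GB": GradientBoostingClassifier(random_state=0),
    "SVM": SVC(kernel="rbf", gamma="scale"),
    "MLP": MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500,
                         random_state=0),
}

# Cross-validated overall accuracy for each candidate classifier.
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```

In practice, such a comparison would use the segmented image objects and their extracted features rather than synthetic data, and each model would be tuned before accuracies are compared.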

Highlights

  • Although several previous studies have shown the importance of multiscale mapping [3,6,28], almost all comparable studies used a single-scale comparison [19,25,27,29]. Given these limitations, this study focused on comparing two popular variants of stacked autoencoders (i.e., the sparse autoencoder (SAE) and the variational autoencoder (VAE)), the multilayer perceptron (MLP), and convolutional neural networks (CNNs) for Geographic Object-Based Image Analysis (GEOBIA) land use/land cover (LULC) classification.

  • The Bagging Trees (BT) classifier achieved the worst overall accuracy, which can likely be ascribed to its inability to handle the high dimensionality of the data.

  • The RF model reduces the correlation between the decision trees (DTs) in the ensemble through random sampling of features, which led to an increase in classification accuracy in our experiment.
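The feature-subsampling mechanism in the last highlight can be illustrated with scikit-learn's `max_features` parameter, which controls how many features each split may consider (settings below are illustrative, not the study's configuration):

```python
# Sketch: Random Forests decorrelate their trees by letting each split
# consider only a random subset of the features (max_features), whereas
# plain bagged trees give every split all features. Data are synthetic.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=400, n_features=40, n_informative=10,
                           random_state=1)

# max_features="sqrt": random feature subset per split (the classic RF rule).
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                            random_state=1)
# max_features=None: every split sees all features, i.e., bagged trees.
bagged = RandomForestClassifier(n_estimators=100, max_features=None,
                                random_state=1)

print("RF (sqrt of features per split):",
      cross_val_score(rf, X, y, cv=5).mean())
print("Bagged trees (all features per split):",
      cross_val_score(bagged, X, y, cv=5).mean())
```

Whether the decorrelated forest actually wins depends on the data; the highlight reports that it did in this study's experiments.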


Introduction

Some urban LULC features, such as buildings, have spectral and spatial properties that may vary widely even within a single urban area. Since a single pixel in an HSR image represents just a small part of an LULC object (e.g., a building rooftop or a tree crown), pixel-wise classification cannot properly model the variability of different LULC types. To put it differently, the lack of extra information (i.e., spectral, spatial, and textural context) hinders pixel-wise classification schemes from correctly assigning individual pixels to their real-world land classes.
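The object-based alternative implied here, summarizing many pixels of a segment into per-object statistics, can be sketched as follows. The image and segment map are synthetic stand-ins, and `scipy.ndimage` is just one common way to aggregate per-segment features:

```python
# Sketch: per-object (segment) mean spectral features -- the kind of
# aggregated context information that pixel-wise classification lacks.
# The "image" and "segments" arrays are synthetic placeholders.
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
image = rng.random((100, 100, 4))  # stand-in for a 4-band HSR image
# Stand-in for a segmentation result: 10 contiguous fake segments.
segments = np.repeat(np.arange(10), 1000).reshape(100, 100)

# One feature vector per segment: the mean of each band over its pixels.
labels = np.unique(segments)
features = np.stack([
    ndimage.mean(image[..., band], labels=segments, index=labels)
    for band in range(image.shape[-1])
], axis=1)
print(features.shape)  # one row per segment, one column per band
```

In a real GEOBIA workflow, the segment map would come from an image segmentation step, and the per-object table would typically also include shape and texture statistics before being fed to a classifier.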

