Abstract
Building extraction with high accuracy using semantic segmentation from high-resolution remotely sensed imagery has a wide range of applications like urban planning, updating of geospatial database, and disaster management. However, automatic building extraction with non-noisy segmentation map and obtaining accurate boundary information is a big challenge for most of the popular deep learning methods due to the existence of some barriers like cars, vegetation cover and shadow of trees in the high-resolution remote sensing imagery. Thus, we introduce an end-to-end convolutional neural network called Generative Adversarial Network (GAN) in this study to tackle these issues. In the generative model, we utilized SegNet model with Bi-directional Convolutional LSTM (BConvLSTM) to generate the segmentation map from Massachusetts building dataset containing high-resolution aerial imagery. BConvLSTM combines encoded features (containing of more local information) and decoded features (containing of more semantic information) to improve the performance of the model even with the presence of complex backgrounds and barriers. The adversarial training method enforces long-range spatial label vicinity to tackle with the issue of covering building objects with the existing occlusions such as trees, cars and shadows and achieve high-quality building segmentation outcomes under the complex areas. The quantitative results obtained by the proposed technique with an average F1-score of 96.81% show that the suggested approach could achieve better results through detecting and adjusting the difference between the segmentation model output and the reference map compared to other state-of-the-art approaches such as autoencoder method with 91.36%, SegNet+BConvLSTM with 95.96%, FCN-CRFs with 95.36%% SegNet with 94.77%, and GAN-SCA model with 96.36% accuracy.
Highlights
One of the main stages in geospatial information system (GIS) applications, such as change detection, urban land use analysis, geospatial database updating, and infrastructureThe associate editor coordinating the review of this manuscript and approving it for publication was Wenming Cao .planning, is the automatic extraction of features, like building objects from high-resolution remote sensing imagery
Effective state-of-the-art proficiency has been obtained using deep learning models for semantic classification [6], [7] by some methods that have been suggested in the field of computer vision, such as integrated convolutional neural network (CNN) models with segmentation [8], the deep parsing model [9], patch network [10], the combination of conditional random fields (CRFs) with CNN [11], DeepLab [12], decoupled networks [13], deconvolutional models [14], and SegNet [15], and in the field of remote sensing [16], [17]
The proposed model was trained with a batch size 1 for 100 epochs, and the entire process of the suggested approach for building extraction from remote sensing imagery was performed on a GPU Nvidia Quadro RTX 6000 with a memory of 24 GB and a computing capability of 7.5 under the framework of Keras with a Tensorflow back-end
Summary
The associate editor coordinating the review of this manuscript and approving it for publication was Wenming Cao. planning, is the automatic extraction of features, like building objects from high-resolution remote sensing imagery. A. Abdollahi et al.: Building Footprint Extraction from High Resolution Aerial Images of urban domains [1]. Other challenges exist in these images, such as shadows, overlapping, and interlacing sheltering [2]. These challenges are even more pronounced in the extraction of urban features, like building objects [3]. Designing a robust approach that could achieve high-precision features extraction on high-resolution remote sensing imagery is very difficult [5]. Effective state-of-the-art proficiency has been obtained using deep learning models for semantic classification [6], [7] by some methods that have been suggested in the field of computer vision, such as integrated convolutional neural network (CNN) models with segmentation [8], the deep parsing model [9], patch network [10], the combination of conditional random fields (CRFs) with CNN [11], DeepLab [12], decoupled networks [13], deconvolutional models [14], and SegNet [15], and in the field of remote sensing [16], [17]
Published Version (Free)
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have