Abstract
Automatic building extraction from optical imagery remains a challenge due to, for example, the complexity of building shapes. Semantic segmentation is an efficient approach for this task. Recent developments in deep convolutional neural networks (DCNNs) have made accurate pixel-level classification possible. Yet one central issue remains: the precise delineation of boundaries. Deep architectures generally fail to produce fine-grained segmentation with accurate boundaries due to their progressive down-sampling. Hence, we introduce a generic framework to overcome this issue, integrating a graph convolutional network (GCN) and deep structured feature embedding (DSFE) into an end-to-end workflow. Furthermore, instead of using a classic graph convolutional neural network, we propose a gated graph convolutional network, which enables the refinement of weak and coarse semantic predictions to generate sharp borders and fine-grained pixel-level classification. Taking the semantic segmentation of building footprints as a practical example, we compared different feature embedding architectures and graph neural networks. Our proposed framework with the new GCN architecture outperforms state-of-the-art approaches. Although our main task in this work is building footprint extraction, the proposed method can be generally applied to other binary or multi-label segmentation tasks.
Highlights
Building footprint generation is an active topic in the remote sensing field
We propose a gated graph convolutional network, which is a trainable inference system based on GCN and a recurrent neural network (RNN) with gated recurrent units (GRUs)
The proposed framework with graph models such as CRFasRNN, GCN, and GGCN captures finer details than CNN-only methods, which confirms the effectiveness of the graph model in modelling the interaction among pixels and spatial information propagation
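The idea of combining a graph convolution with GRU-style gating, as named in the highlights above, can be illustrated with a minimal NumPy sketch. This is not the authors' formulation, which is not given in this text. It assumes a 4-connected pixel grid as the graph, a symmetrically normalized adjacency matrix, and a standard GRU update applied to the aggregated neighbour messages. All function names, parameter names, and sizes are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def grid_adjacency(height, width):
    """Adjacency matrix of a 4-connected pixel grid (assumed graph layout)."""
    n = height * width
    adj = np.zeros((n, n))
    for r in range(height):
        for c in range(width):
            i = r * width + c
            if c + 1 < width:          # right neighbour
                adj[i, i + 1] = adj[i + 1, i] = 1.0
            if r + 1 < height:         # bottom neighbour
                adj[i, i + width] = adj[i + width, i] = 1.0
    return adj


def normalized_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} with self-loops."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def init_params(dim, seed=0):
    """Random weights for illustration only; in practice these are learned."""
    rng = np.random.default_rng(seed)
    names = ["W", "Wz", "Uz", "Wr", "Ur", "Wh", "Uh"]
    return {k: rng.normal(scale=0.1, size=(dim, dim)) for k in names}


def gated_graph_conv(h, a_norm, params, steps=3):
    """Refine node features h (n_nodes x dim) over several propagation steps."""
    for _ in range(steps):
        # Graph convolution: aggregate transformed features from neighbours.
        m = a_norm @ h @ params["W"]
        # GRU gates decide how much of the message overwrites each node state.
        z = sigmoid(m @ params["Wz"] + h @ params["Uz"])   # update gate
        r = sigmoid(m @ params["Wr"] + h @ params["Ur"])   # reset gate
        h_tilde = np.tanh(m @ params["Wh"] + (r * h) @ params["Uh"])
        h = (1.0 - z) * h + z * h_tilde
    return h


if __name__ == "__main__":
    height, width, dim = 3, 4, 8
    a_norm = normalized_adjacency(grid_adjacency(height, width))
    features = np.random.default_rng(1).normal(size=(height * width, dim))
    refined = gated_graph_conv(features, a_norm, init_params(dim))
    print(refined.shape)
```

In this sketch the per-pixel features stand in for the output of a feature embedding network, and the gating lets each pixel keep its own state where neighbouring evidence is weak, which is one plausible reading of how gated propagation could sharpen coarse predictions near boundaries.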
Summary
Building footprint generation is an active topic in the remote sensing field. Recently, it has received considerable attention due to its huge potential in autonomous driving, virtual reality, urban planning, and environmental and demographic applications. Semantic segmentation is a comparatively inexpensive and time-saving technique for extracting building footprints. It aims to assign a class label to each pixel. Various semi-automatic and automatic methods (Ok, 2013; Xu et al., 2018; Bittner et al., 2018; Chen et al., 2019) have been developed to improve segmentation accuracy; traditionally, feature extraction and classification are the two main steps. The extraction of such hand-crafted features usually requires strong domain-specific knowledge.
More From: ISPRS Journal of Photogrammetry and Remote Sensing