Abstract

Automatic building extraction from optical imagery remains challenging, for example because of the complexity of building shapes. Semantic segmentation is an efficient approach for this task. Recent developments in deep convolutional neural networks (DCNNs) have made accurate pixel-level classification possible. Yet one central issue remains: the precise delineation of boundaries. Deep architectures generally fail to produce fine-grained segmentation with accurate boundaries because of their progressive down-sampling. Hence, we introduce a generic framework to overcome this issue, integrating the graph convolutional network (GCN) and deep structured feature embedding (DSFE) into an end-to-end workflow. Furthermore, instead of using a classic graph convolutional neural network, we propose a gated graph convolutional network, which refines weak and coarse semantic predictions to generate sharp borders and fine-grained pixel-level classification. Taking the semantic segmentation of building footprints as a practical example, we compared different feature embedding architectures and graph neural networks. Our proposed framework with the new GCN architecture outperforms state-of-the-art approaches. Although our main task in this work is building footprint extraction, the proposed method can be generally applied to other binary or multi-label segmentation tasks.

Highlights

  • Building footprint generation is an active topic in the remote sensing field

  • We propose a gated graph convolutional network, which is a trainable inference system based on GCN and a recurrent neural network (RNN) with gated recurrent units (GRUs)

  • The proposed framework with different graph models such as CRFasRNN, GCN, and GGCN captures finer details than CNN-only methods, which confirms the effectiveness of the graph model in modelling the interaction among pixels and in propagating spatial information
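
The combination of graph convolution and gated recurrent units named in the highlights can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the parameter dictionary keys (`Wz`, `Uz`, `Wr`, `Ur`, `Wh`, `Uh`), the symmetric adjacency normalization, and the particular GRU formulation are all assumptions chosen for illustration, and the code uses plain Python lists so that it runs without extra dependencies.

```python
import math

def matmul(x, y):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def normalized_adjacency(adj):
    """Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2."""
    n = len(adj)
    a = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    deg = [sum(row) for row in a]  # >= 1 for non-negative adj (self-loops)
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def combine(f, *mats):
    """Apply f element-wise across matrices of identical shape."""
    return [[f(*vals) for vals in zip(*rows)] for rows in zip(*mats)]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gated_graph_step(h, a_hat, p):
    """One propagation step (assumed form): aggregate neighbour states
    with the normalized adjacency, then update each node with a
    GRU-style gate."""
    m = matmul(a_hat, h)  # graph-convolution-style neighbourhood aggregation
    # Update gate z and reset gate r, computed from message m and state h.
    z = combine(lambda a, b: sigmoid(a + b),
                matmul(m, p["Wz"]), matmul(h, p["Uz"]))
    r = combine(lambda a, b: sigmoid(a + b),
                matmul(m, p["Wr"]), matmul(h, p["Ur"]))
    rh = combine(lambda a, b: a * b, r, h)
    # Candidate state built from the message and the reset-gated state.
    cand = combine(lambda a, b: math.tanh(a + b),
                   matmul(m, p["Wh"]), matmul(rh, p["Uh"]))
    # Convex combination of the old state and the candidate state.
    return combine(lambda zi, hi, ci: (1.0 - zi) * hi + zi * ci, z, h, cand)

# Hypothetical usage: a 3-node path graph with 2-dimensional node features.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
params = {name: identity for name in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}
a_hat = normalized_adjacency(adj)
for _ in range(3):  # a few recurrent propagation steps
    h = gated_graph_step(h, a_hat, params)
```

In a segmentation setting, each node would presumably correspond to a pixel (or feature-map location) and `h` to its embedded features, with the weights learned end to end rather than fixed as in this toy usage.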

Summary

Introduction

Building footprint generation is an active topic in the remote sensing field. Recently, it has received considerable attention due to its huge potential in autonomous driving, virtual reality, urban planning, and environmental and demographic applications. Semantic segmentation is a comparatively inexpensive and time-saving technique for extracting building footprints. It aims to assign a class label to each pixel. Various semi-automatic and automatic methods (Ok, 2013; Xu et al., 2018; Bittner et al., 2018; Chen et al., 2019) have been developed to improve segmentation accuracy; traditionally, such methods consist of two main steps, feature extraction and classification. The extraction of such hand-crafted features usually requires strong domain-specific knowledge.
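
The per-pixel labelling that semantic segmentation performs can be sketched as follows. This is an illustrative example only: the function names, the 0.5 threshold, and the toy probability maps are assumptions, not values taken from the paper.

```python
def classify_pixels(prob_map, threshold=0.5):
    """Binary case: label a pixel 1 (building) if its predicted
    probability reaches the threshold, otherwise 0 (background)."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]

def classify_multiclass(score_map):
    """Multi-label case: score_map[y][x] is a list of per-class scores;
    each pixel receives the index of its highest-scoring class."""
    return [[max(range(len(scores)), key=scores.__getitem__)
             for scores in row] for row in score_map]

# Hypothetical 2x3 probability map produced by some segmentation network.
probabilities = [[0.91, 0.62, 0.08],
                 [0.75, 0.40, 0.03]]
building_mask = classify_pixels(probabilities)
```

The resulting mask has the same height and width as the input, which is what distinguishes segmentation from image-level classification.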
