Abstract

Building footprint segmentation from high-resolution remote sensing (RS) images plays a vital role in urban planning, disaster response, and population density estimation. Convolutional neural networks (CNNs) have recently become the workhorse for effectively generating building footprints. However, fully exploiting the predictive power of CNNs requires large-scale pixel-level annotations. Most state-of-the-art CNN-based methods focus on designing network architectures that improve building footprint predictions under full annotation, while little work has been done on building footprint segmentation with limited annotations. In this article, we propose a novel semi-supervised learning method for building footprint segmentation, which can effectively predict building footprints from a network trained with few annotations (e.g., only $0.0324\,\text{km}^2$ out of a $2.25\text{-km}^2$ area is labeled). The proposed method investigates the contrast between building and background pixels in latent space and the consistency of predictions obtained from the CNN models when the input RS images are perturbed. We therefore term the proposed semi-supervised building footprint segmentation framework PiCoCo, as it enforces Pixelwise Contrast and Consistency during the learning phase. Our experiments, conducted on two benchmark building segmentation datasets, validate the effectiveness of the proposed framework compared to several state-of-the-art building footprint extraction and semi-supervised semantic segmentation methods.

Highlights

  • Segmenting building footprints from high-resolution remote sensing (RS) images has become a basic task within the field of intelligent RS image interpretation

  • By taking advantage of limited labeled areas and large amounts of unlabeled areas, we propose a novel semi-supervised building footprint segmentation framework, PiCoCo, which is based on contrast and consistency learning in a pixelwise manner

  • PiCoCo is built on two learning concepts: 1) contrast learning, which aims to learn a compact and discriminative latent representation space for distinguishing building and background pixels; 2) consistency learning, which enforces the consistency of the model's predictions across different perturbations of the input images
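As a minimal illustration of the first concept (not the authors' implementation), a supervised pixelwise contrastive loss can be sketched in NumPy: each pixel embedding is pulled toward embeddings of the same class (building or background) and pushed away from the other class via an InfoNCE-style objective. The function name and temperature default here are illustrative assumptions.

```python
import numpy as np

def pixel_contrastive_loss(embeddings, labels, temperature=0.1):
    """InfoNCE-style supervised pixelwise contrastive loss (illustrative sketch).

    embeddings: (N, D) array of L2-normalized per-pixel features
    labels:     (N,) binary class per pixel (1 = building, 0 = background)
    """
    sim = embeddings @ embeddings.T / temperature   # pairwise cosine similarities
    np.fill_diagonal(sim, -np.inf)                  # exclude self-pairs
    exp_sim = np.exp(sim)
    positive = labels[:, None] == labels[None, :]   # same-class pair mask
    np.fill_diagonal(positive, False)
    pos_sum = (exp_sim * positive).sum(axis=1)      # attraction to same class
    total = exp_sim.sum(axis=1)                     # all candidate pairs
    return float(-np.log(pos_sum / total).mean())
```

When same-class pixels cluster in latent space, the loss is near zero; mixing the classes drives it up, which is the compactness/discriminativeness the highlight describes.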


Summary

INTRODUCTION

Segmenting building footprints from high-resolution remote sensing (RS) images has become a basic task within the field of intelligent RS image interpretation. By simultaneously and automatically learning low- and high-level features from massive RS images, CNNs can effectively capture both the shape and semantic information of building regions, which significantly improves generalization and robustness in footprint segmentation of diverse RS images compared to conventional methods [24]. PiCoCo is built on two learning concepts: 1) contrast learning, which aims to learn a compact and discriminative latent representation space for distinguishing building and background pixels; and 2) consistency learning, which enforces the consistency of the model's predictions across different perturbations of the input images. To this end, the main contribution of the proposed framework can be summarized as follows: 1) a novel semi-supervised building footprint segmentation framework, i.e., PiCoCo, is proposed for effectively learning building footprints when only around 1% of the pixels have labels.
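To make the second concept concrete, the consistency term can be sketched as the disagreement between predictions on an input image and on a perturbed copy of it. The `model` below is a hypothetical pixelwise stand-in for the segmentation CNN, and additive Gaussian noise stands in for the paper's input perturbations; both are assumptions for illustration only.

```python
import numpy as np

def model(image):
    """Hypothetical stand-in for the segmentation network: maps an
    (H, W, C) image to per-pixel building probabilities."""
    return 1.0 / (1.0 + np.exp(-image.mean(axis=-1)))

def consistency_loss(image, noise_std=0.1, seed=0):
    """Mean squared disagreement between predictions on the original
    image and on a noise-perturbed copy; minimizing it pushes the
    model toward perturbation-invariant predictions."""
    rng = np.random.default_rng(seed)
    perturbed = image + rng.normal(0.0, noise_std, size=image.shape)
    return float(np.mean((model(image) - model(perturbed)) ** 2))
```

Because the loss needs no labels, it can be computed on the large unlabeled areas, which is what lets the framework learn from roughly 1% labeled pixels.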

