Abstract

Building footprint information is one of the key factors for sustainable urban planning and environmental monitoring. Mapping building footprints from remote sensing images is an important and challenging task in the earth observation field. Over the years, convolutional neural networks have brought substantial improvements to building extraction because they can automatically extract hierarchical features and make building predictions. However, because buildings vary in size, scene, and roofing material, it is hard to delineate buildings of different sizes precisely, especially over large areas (e.g., nationwide). To tackle these limitations, we propose a novel deep-supervision convolutional neural network (denoted as DS-Net) for extracting building footprints from high-resolution remote sensing images. In the proposed network, we apply deep supervision with an extra lightweight encoder, which enables the network to learn representative building features at different scales. Furthermore, a scale attention module is designed to aggregate multiscale features and generate the final building prediction. Experiments on two publicly available building datasets, the WHU Building Dataset and the Massachusetts Building Dataset, show the effectiveness of the proposed method. With only 0.22 M more parameters than U-Net, the proposed DS-Net achieved an IoU of 90.4% on the WHU Building Dataset and 73.8% on the Massachusetts Building Dataset. DS-Net also outperforms state-of-the-art building extraction methods on the two datasets, indicating the effectiveness of the proposed deep supervision and scale attention.
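The IoU figures quoted above follow the standard intersection-over-union definition for the building class, TP / (TP + FP + FN). As a minimal sketch of that metric (not the authors' evaluation code), the hypothetical helper `building_iou` below compares two binary masks stored as nested lists; returning 1.0 when both masks are empty is a convention assumed here, not something the source specifies.

```python
def building_iou(pred, truth):
    """Intersection over union for the building class.

    pred and truth are binary masks given as lists of rows of 0/1 values
    with identical shapes (an assumption of this sketch).
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # building pixel predicted correctly
            elif p and not t:
                fp += 1  # background predicted as building
            elif t and not p:
                fn += 1  # building pixel missed
    union = tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    return tp / union if union else 1.0


pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
print(building_iou(pred, truth))  # 2 shared pixels / 4 pixels in the union
```

True-negative (background) pixels do not enter the ratio, so the score is not inflated by scenes that are mostly background.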

Highlights

  • Building footprint extraction is one of the research hotspots in the remote sensing field due to the broad application of building information [1], [2].

  • We selected the aerial subset of the WHU Building Dataset, which covers various buildings of different appearances and scales

  • (4) SRI-Net: The spatial residual network [32], termed SRI-Net, is a fully convolutional network (FCN) for building extraction designed by Liu et al. SRI-Net is capable of retaining global semantic information and …


Summary

INTRODUCTION

Building footprint extraction is one of the research hotspots in the remote sensing field due to the broad application of building information [1], [2]. Although much effort has been made to fuse multiscale building features, there is still a trade-off between the benefits of low-level details and high-level semantics, and how to adaptively integrate information from different scales remains a challenge. To tackle these problems, in this article we designed a novel deep-supervision fully convolutional network (denoted as DS-Net) for building footprint extraction. In DS-Net, we designed a lightweight and effective deep supervision subnetwork to boost the model's robustness to buildings of different scales, considering the large scale variation among buildings in real-world datasets. This deep supervision subnetwork is capable of generating multiscale building predictions, which enables the model to learn more representative deep features of buildings with varying scales.
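To make the idea of supervising multiscale predictions concrete, here is a minimal sketch of a deep-supervision loss. It is not the DS-Net implementation: the source does not state the loss function, how the ground truth is resized, or how the terms are weighted. Binary cross-entropy, max-pool downsampling of the mask, the equal `side_weight`, and all function names below are assumptions made for illustration.

```python
import math


def downsample_mask(mask, factor):
    """Max-pool a binary mask by `factor`: a block is building if any pixel is.

    Assumes the mask height and width are divisible by `factor`.
    """
    rows, cols = len(mask), len(mask[0])
    return [
        [
            max(mask[r + dr][c + dc]
                for dr in range(factor) for dc in range(factor))
            for c in range(0, cols, factor)
        ]
        for r in range(0, rows, factor)
    ]


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between probabilities and 0/1 labels."""
    total, count = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
            count += 1
    return total / count


def deep_supervision_loss(final_pred, side_preds, truth, side_weight=1.0):
    """Loss on the final prediction plus losses on coarser side predictions.

    side_preds maps a downsampling factor (e.g. 2 or 4) to the probability
    map predicted at that scale; each is compared with the ground truth
    reduced to the same resolution.
    """
    loss = bce(final_pred, truth)
    for factor, pred in side_preds.items():
        loss += side_weight * bce(pred, downsample_mask(truth, factor))
    return loss


truth = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 1]]
final_pred = [[0.9, 0.8, 0.1, 0.1],
              [0.9, 0.7, 0.2, 0.1],
              [0.1, 0.1, 0.1, 0.2],
              [0.1, 0.1, 0.3, 0.8]]
side_preds = {2: [[0.8, 0.2],
                  [0.1, 0.6]]}
print(deep_supervision_loss(final_pred, side_preds, truth))
```

Because every scale contributes its own loss term, an error at a coarse resolution is penalized directly instead of only through the final output, which is the general motivation for deep supervision.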

METHODOLOGY
Encoder-decoder architecture
Scale Attention Module
Dataset Descriptions
Experimental Details
Evaluation Metrics
Comparison methods
Results and Analysis
Method
CONCLUSION
