Abstract

Extracting buildings from high-resolution remote sensing images is essential for many geospatial applications, such as building change detection, urban planning, and disaster emergency assessment. Due to the diversity of geometric shapes and the blurring of boundaries among buildings, it is still a challenging task to accurately generate building footprints from the complex scenes of remote sensing images. The rapid development of convolutional neural networks presents both new opportunities and challenges for extracting buildings from high-resolution remote sensing images. To capture multilevel contextual information, most deep learning methods extract buildings by integrating multilevel features. However, the differential responses between such multilevel features are often ignored, leading to blurred contours in the extraction results. In this study, we propose an end-to-end multitask building extraction method to address these issues; this approach utilizes the rich contextual features of remote sensing images to assist with building segmentation while ensuring that the shape of the extraction results is preserved. By combining boundary classification and boundary distance regression, clear contour and distance transformation maps are generated to further improve the accuracy of building extraction. Subsequently, multiple refinement modules are used to refine each part of the network to minimize the loss of image feature information. Experimental comparisons conducted on the SpaceNet and Massachusetts building datasets show that the proposed method outperforms other deep learning methods in terms of building extraction results.

Highlights

  • Buildings, as important manifestations of urban development and construction, have become a popular focus of research in segmentation tasks

  • We propose a multitask deep learning model for building extraction: an encoder–decoder network generates coarse building maps, two auxiliary decoders correct blurred boundaries, and a multitask loss supervises the accuracy of the building extraction process

  • Compared with U-Net, the proposed method increases the intersection over union (IoU) by 1.3%, which demonstrates its improved building extraction performance
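The multitask supervision described above combines a segmentation loss with boundary-classification and distance-regression losses. The sketch below illustrates this weighted combination with NumPy; the function names, the choice of binary cross-entropy and mean squared error for the individual terms, and the equal default weights are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Binary cross-entropy averaged over pixels (segmentation/boundary terms)."""
    pred = np.clip(pred, eps, 1 - eps)
    return float(-np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred)))

def mse(pred, target):
    """Mean squared error (distance-map regression term)."""
    return float(np.mean((pred - target) ** 2))

def multitask_loss(seg_pred, seg_gt, bnd_pred, bnd_gt, dist_pred, dist_gt,
                   w_seg=1.0, w_bnd=1.0, w_dist=1.0):
    """Weighted sum of the segmentation, boundary, and distance losses."""
    return (w_seg * bce(seg_pred, seg_gt)
            + w_bnd * bce(bnd_pred, bnd_gt)
            + w_dist * mse(dist_pred, dist_gt))
```

In practice the weights would be tuned so that no single task dominates the gradient signal.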


Introduction

Buildings, as important manifestations of urban development and construction, are becoming a popular focus of research in segmentation tasks. Building footprint maps based on such high-resolution remote sensing images are essential for building change detection [1], urban planning [2], and disaster emergency assessment. Since high-resolution remote sensing images contain a large amount of intrinsic characteristic information, such as spectral [3,4], textural [4,5], geometric [6,7], and contextual information [8], most studies on building extraction algorithms are based on features of these kinds. U-Net was proposed by Ronneberger et al. [13] and was originally used to solve medical image segmentation problems. It has a U-shaped network structure comprising an encoder and a decoder; the skip connections in the decoder stage effectively combine low-level features with high-level features to recover the lost feature information.
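The skip-connection mechanism described above can be sketched in a few lines: the encoder downsamples a feature map, the decoder upsamples it back, and the saved high-resolution (low-level) features are concatenated with the upsampled (high-level) features along the channel axis. This is a minimal NumPy illustration of the data flow only; the shapes and pooling/upsampling choices are assumptions, and a real U-Net would interleave learned convolutions at every stage.

```python
import numpy as np

def max_pool2x2(x):
    """2x2 max pooling over an (H, W, C) feature map (encoder downsampling)."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

def upsample2x(x):
    """Nearest-neighbour 2x upsampling (decoder upsampling)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

# Encoder: keep the full-resolution feature map for the skip connection
features = np.random.rand(8, 8, 4)   # low-level features, full resolution
pooled = max_pool2x2(features)       # high-level features, half resolution

# Decoder: upsample and concatenate with the saved low-level features
upsampled = upsample2x(pooled)
fused = np.concatenate([upsampled, features], axis=-1)
print(fused.shape)  # (8, 8, 8): channels from both levels are combined
```

The concatenation is what lets the decoder recover spatial detail (e.g. building boundaries) that pooling discarded.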
