Abstract

Deep learning techniques such as convolutional neural networks have substantially improved the performance of building segmentation from remote sensing images. However, the images used for building segmentation are often traditional orthophotos, in which relief displacement causes non-negligible misalignment between the roof outline and the footprint of a building; this misalignment makes it considerably harder to extract accurate building footprints, especially for high-rise buildings. To alleviate this problem, a new workflow is proposed for generating rectified building footprints from traditional orthophotos. We first use facade labels, which are prepared efficiently at low cost, along with roof labels to train a semantic segmentation network. Then, the trained network, which employs the state-of-the-art version of EfficientNet as its backbone, extracts the roof segments and the facade segments of buildings from the input image. Finally, after clustering the classified pixels into instance-level building objects and tracing out the roof outlines, an energy function is proposed to drive each roof outline to maximally align with the building footprint, so that the rectified footprints can be generated. Experiments on aerial orthophotos covering a high-density residential area in Shanghai demonstrate that the proposed workflow generates markedly more accurate building footprints than the baseline methods, especially for high-rise buildings.
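The final step of the workflow, moving a roof outline until it aligns with the footprint, can be illustrated with a toy example. The abstract does not give the form of the energy function, so the sketch below substitutes a simple stand-in that is an assumption for illustration only: a candidate footprint (the roof mask translated by an integer offset) is rewarded for covering facade pixels and penalized for leaving the visible building region. The names `shift_mask`, `toy_energy`, and `estimate_offset` are hypothetical and do not come from the paper.

```python
import numpy as np


def shift_mask(mask, dy, dx):
    """Translate a boolean mask by (dy, dx) pixels, filling with False (no wrap-around)."""
    h, w = mask.shape
    out = np.zeros_like(mask)
    if abs(dy) >= h or abs(dx) >= w:
        return out  # shifted completely out of the image
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_y = slice(max(0, dy), min(h, h + dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_x = slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = mask[src_y, src_x]
    return out


def toy_energy(roof, facade, dy, dx):
    """Assumed stand-in energy (not the paper's): lower is better.

    Penalizes shifted-roof pixels that fall outside roof-or-facade,
    and rewards shifted-roof pixels that land on facade pixels.
    """
    shifted = shift_mask(roof, dy, dx)
    building = roof | facade
    outside = np.count_nonzero(shifted & ~building)
    on_facade = np.count_nonzero(shifted & facade)
    return outside - on_facade


def estimate_offset(roof, facade, max_shift=5):
    """Brute-force search over integer offsets for the minimum toy energy."""
    best_offset = (0, 0)
    best_energy = toy_energy(roof, facade, 0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            energy = toy_energy(roof, facade, dy, dx)
            if energy < best_energy:
                best_offset, best_energy = (dy, dx), energy
    return best_offset, shift_mask(roof, *best_offset)


# Synthetic building: a 4x4 roof with a 3-row facade visible directly below it.
roof = np.zeros((12, 10), dtype=bool)
roof[2:6, 2:6] = True
facade = np.zeros((12, 10), dtype=bool)
facade[6:9, 2:6] = True

offset, footprint = estimate_offset(roof, facade)
print("estimated roof-to-footprint offset (dy, dx):", offset)
```

In this synthetic case the facade extends three rows below the roof, so the search moves the roof mask down over the facade. A real implementation would operate on the traced roof outlines of each building instance and on the energy defined in the paper, which this sketch does not reproduce.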

Highlights

  • High-precision building footprints are among the most important elements of a city's geographic vector map and play a significant role in many fields, such as urban planning, post-disaster management, carbon emission calculation, and location-based services

  • Although adopting true orthophotos instead of traditional ones can theoretically remove the residual tilt of buildings, producing a true orthophoto relies heavily on a high-quality digital surface model (DSM) [7] or digital building model (DBM) [8]; the DSM may be unavailable in many situations or of limited quality, while acquiring a DBM presupposes, to some extent, the very building detection that is the goal

  • Unlike previous studies, our workflow fully accounts for the tilt effect of buildings

Introduction

High-precision building footprints are among the most important elements of a city's geographic vector map and play a significant role in many fields, such as urban planning, post-disaster management, carbon emission calculation, and location-based services. The successful application of deep learning techniques such as convolutional neural networks (CNNs) has greatly improved the accuracy of automatic building detection from remote sensing images [1,2,3]. Despite this achievement, few studies have proven capable of accurately extracting building footprints (i.e., the boundaries where the building facades meet the ground) from traditional orthophotos; instead, several previous works focus on segmenting roof surfaces from the input image [4,5,6]. Although adopting true orthophotos instead of traditional ones can theoretically remove the residual tilt of buildings, producing a true orthophoto relies heavily on a high-quality digital surface model (DSM) [7] or digital building model (DBM) [8]. The DSM may be unavailable in many situations (e.g., a DSM can hardly be derived when the imaging sensor is monocular) or of limited quality, while acquiring a DBM presupposes, to some extent, the very building detection that is the goal.

