Abstract
Accurate building footprint polygons provide essential data for a wide range of urban applications. While deep learning models have been proposed to extract pixel-based building areas from remote sensing imagery, the direct vectorization of pixel-based building maps often leads to building footprint polygons with irregular shapes that are inconsistent with real building boundaries, making them difficult to use in geospatial analysis. In this study, we propose a novel deep learning-based framework for automated extraction of building footprint polygons (DLEBFP) from very high-resolution aerial imagery by combining deep learning models for different tasks. Our approach uses the U-Net, Cascade R-CNN, and Cascade CNN deep learning models to obtain building segmentation maps, building bounding boxes, and building corners, respectively, from very high-resolution remote sensing images. We use Delaunay triangulation to construct building footprint polygons based on the detected building corners, with the building bounding boxes and building segmentation maps as constraints. Experiments on the Wuhan University building dataset and the ISPRS Vaihingen dataset indicate that DLEBFP performs well in extracting high-quality building footprint polygons. Compared with other semantic segmentation models and the vector map generalization method, DLEBFP achieves comparable pixel-based mapping accuracies while generating building footprint polygons with concise edges and vertices and regular shapes that are close to the reference data. The promising performance indicates that our method has the potential to extract accurate building footprint polygons from remote sensing images for applications in geospatial analysis.
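The Delaunay-based construction step mentioned in the abstract can be illustrated with a minimal sketch: triangulate the detected corners, keep only triangles whose centroid falls inside the building segmentation map, and return the boundary edges of the kept triangles as the footprint polygon. The function name `polygon_from_corners` and the centroid-inside-mask test are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np
from scipy.spatial import Delaunay

def polygon_from_corners(corners, inside_mask):
    """Sketch of Delaunay-based footprint construction (assumed, simplified).

    corners: (N, 2) array of detected (x, y) building corners.
    inside_mask: callable (x, y) -> bool, True when the point lies inside
        the building segmentation map.
    Returns the boundary edges (as sorted index pairs) of the union of
    triangles whose centroid lies inside the mask.
    """
    tri = Delaunay(corners)
    edge_count = {}
    for simplex in tri.simplices:
        cx, cy = corners[simplex].mean(axis=0)
        if not inside_mask(cx, cy):
            continue  # discard triangles outside the segmentation map
        for a, b in ((0, 1), (1, 2), (2, 0)):
            edge = tuple(sorted((simplex[a], simplex[b])))
            edge_count[edge] = edge_count.get(edge, 0) + 1
    # An edge shared by two kept triangles is interior; edges used once
    # form the polygon boundary.
    return [e for e, c in edge_count.items() if c == 1]
```

For an L-shaped building, the triangle spanning the concave notch has its centroid outside the segmentation map and is discarded, so the recovered boundary follows the true concave outline rather than the convex hull of the corners.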
Highlights
Information on the spatial distribution and changes of buildings has a wide range of applications in urban studies, such as urban planning, disaster management, population estimation, and map updating [1,2]
In addition to assessments based on pixel-wise metrics, we computed the vertex-based F1-score (VertexF) as proposed by Chen, Wang, Waslander, and Liu [60] to evaluate the performance of the generated building footprint polygons
One reason is that the object detection method used in our approach is functionally similar to a filter that only screens pixels with high confidence of building footprints, such that it would improve the precision but impair the recall of the model
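The vertex-based evaluation mentioned in the highlights can be sketched as a one-to-one matching of predicted vertices to reference vertices within a distance tolerance. The greedy matching below and the default tolerance are illustrative assumptions; the cited VertexF metric may differ in its exact matching procedure.

```python
import numpy as np

def vertex_f1(pred, ref, tol=3.0):
    """Precision, recall, and F1 of predicted polygon vertices (sketch).

    pred, ref: (N, 2) and (M, 2) arrays of vertex coordinates.
    tol: matching tolerance in pixels (assumed value, not from the paper).
    """
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    if len(pred) == 0 or len(ref) == 0:
        return 0.0, 0.0, 0.0
    # Pairwise distances between predicted and reference vertices.
    dists = np.linalg.norm(pred[:, None, :] - ref[None, :, :], axis=2)
    used = np.zeros(len(ref), dtype=bool)
    tp = 0
    for i in range(len(pred)):
        d = np.where(used, np.inf, dists[i])
        j = int(np.argmin(d))
        if d[j] <= tol:  # greedy match to nearest unmatched reference vertex
            used[j] = True
            tp += 1
    precision = tp / len(pred)
    recall = tp / len(ref)
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1
```

A segmentation-then-vectorization baseline typically produces many redundant vertices along jagged edges, which lowers precision under this metric even when pixel-wise accuracy is high.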
Summary
Information on the spatial distribution and changes of buildings has a wide range of applications in urban studies, such as urban planning, disaster management, population estimation, and map updating [1,2]. Spaceborne and airborne technologies provide abundant remote sensing images that have become increasingly important for extracting building information [4]. Modern sensor technology offers very high-resolution images at sub-meter spatial resolution, making them attractive for extracting accurate building footprint polygons. In very high-resolution images, the influence of mixed pixels is minor, and scene complexity becomes the new challenge for building footprint extraction. Many government departments and industrial companies adopt manual methods to delineate vector data of building footprints from high-resolution remote sensing images so as to obtain vector maps that meet the accuracy requirements of surveying and mapping. As manual annotation is time-consuming and requires expertise [5], there is a need to develop an efficient and robust scheme for automated extraction of building footprint polygons from remote sensing images.