Abstract
Advances in machine learning and computer vision, combined with increased access to unstructured data (e.g., images and text), have created an opportunity to extract building characteristics automatically, cost-effectively, and at scale. These characteristics are relevant to a variety of urban and energy applications, yet they are time-consuming and costly to acquire with today’s manual methods. Several recent studies have shown that, compared with traditional methods based on feature engineering, an end-to-end learning approach built on deep learning algorithms significantly improves the accuracy of automatic building footprint extraction from remote sensing images. However, these studies used limited benchmark datasets that had been carefully curated and labeled. How well the accuracy of these deep learning approaches holds up when the training data are less curated has not received enough attention. The aim of this work is to leverage openly available data to automatically generate a larger training dataset with more variability in terms of regions and city types, which can be used to build more accurate deep learning models. In contrast to most benchmark datasets, the gathered data have not been manually curated; thus, the training dataset is not perfectly clean, as the remote sensing images do not always exactly match the ground-truth building footprints. A workflow comprising data pre-processing, deep learning semantic segmentation modeling, and post-processing of the results is introduced and applied to a dataset that includes remote sensing images from 15 cities and five counties across various regions of the USA, covering 8,607,677 buildings. The accuracy of the proposed approach was measured on an out-of-sample testing dataset corresponding to 364,000 buildings from three US cities. The results compare favorably with those obtained from Microsoft’s recently released US building footprint dataset.
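The three-stage workflow described above (pre-processing, semantic segmentation, post-processing) can be sketched as a simple pipeline. The code below is an illustrative outline only, not the paper's implementation: the function names, the 0.5 probability threshold, and the small-component filter are all assumptions for the sake of the example, and the `model` argument stands in for any trained segmentation network.

```python
import numpy as np

def preprocess(tile):
    """Scale an image tile to [0, 1] floats (stand-in for pre-processing)."""
    tile = tile.astype(np.float32)
    lo, hi = tile.min(), tile.max()
    return (tile - lo) / (hi - lo) if hi > lo else np.zeros_like(tile)

def segment(tile, model):
    """Apply a segmentation model; returns a binary building mask."""
    probs = model(tile)  # per-pixel building probability, shape (H, W)
    return (probs > 0.5).astype(np.uint8)

def postprocess(mask, min_pixels=3):
    """Remove connected components smaller than min_pixels (4-connectivity)."""
    h, w = mask.shape
    seen = np.zeros((h, w), dtype=bool)
    out = np.zeros_like(mask)
    for i in range(h):
        for j in range(w):
            if mask[i, j] and not seen[i, j]:
                # flood-fill one connected component
                stack, comp = [(i, j)], []
                seen[i, j] = True
                while stack:
                    y, x = stack.pop()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny, nx] and not seen[ny, nx]):
                            seen[ny, nx] = True
                            stack.append((ny, nx))
                # keep only components large enough to plausibly be buildings
                if len(comp) >= min_pixels:
                    for y, x in comp:
                        out[y, x] = 1
    return out

def extract_footprints(tile, model):
    """Pre-process -> segment -> post-process, as in the described workflow."""
    return postprocess(segment(preprocess(tile), model))
```

For example, with a dummy "model" that simply passes the normalized tile through as probabilities, a 2×2 bright block survives post-processing while an isolated bright pixel is filtered out as noise.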
Highlights
Building footprint extraction can be used in several application areas such as population density estimation [1,2], urban planning and mapping, building energy modeling and analytics [3,4], and disaster management [5,6,7]
The semantic segmentation algorithms traditionally employed to process remote sensing images were based on image feature engineering, with features often hand-crafted for each situation according to the category of object under consideration
The decision to include NY in this analysis is justified by the fact that NY, and especially Manhattan, is highly dense and has a very specific type of architecture that poses a real challenge to a semantic segmentation approach for detecting building footprints
Summary
Building footprint extraction can be used in several application areas such as population density estimation [1,2], urban planning and mapping, building energy modeling and analytics [3,4], and disaster management [5,6,7]. The use of high-resolution remote sensing images (i.e., satellite and aerial) has been increasingly explored to obtain building footprint information. While identifying building geometry from this type of imagery manually is time-consuming and costly, automatic feature extraction methods hold great promise. Image semantic segmentation methods, which address the problem of assigning a categorical label (class) to each pixel of an image, are among the most commonly studied approaches for automatic extraction of features from remote sensing images. Due to the high variability of building footprint appearances, environmental characteristics such as surrounding vegetation and terrain conditions, as well as the impact of the sensor used to collect the imagery (e.g., its resolution), automatically extracting building footprints remains a challenging task.
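Because semantic segmentation produces a label for every pixel, footprint accuracy is naturally measured with per-pixel overlap metrics; a standard one is intersection-over-union (IoU) between the predicted mask and the ground-truth mask. The function below is a minimal illustrative sketch of that metric, not the paper's evaluation code:

```python
import numpy as np

def pixel_iou(pred, truth):
    """Intersection-over-union between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:  # both masks empty: treat as perfect agreement
        return 1.0
    return np.logical_and(pred, truth).sum() / union
```

For instance, a prediction covering two pixels that overlaps the single ground-truth pixel yields an IoU of 1/2.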