Enhancing building extraction from remote sensing images through UNet and transfer learning

Smail Ait El Asri, Ismail Negabi, Samir El Adib, Naoufal Raissouni

doi:10.1080/1206212x.2023.2219117

Abstract

Accurate extraction of buildings from remote sensing (RS) images is crucial for applications such as urban planning, disaster management, and urban monitoring. The task remains challenging because building shapes, sizes, and textures are diverse and complex, and because lighting and weather conditions vary. To address these difficulties, we propose an improved approach to building extraction based on UNet and transfer learning. We tested seven different backbone architectures within the UNet encoder and found that combining UNet with ResNet101 or ResNet152 yielded the best results. Based on these findings, we combined these two best-performing base models into a novel architecture, which achieved significant improvements over previous methods. Specifically, our method achieved a 1.33% increase in Intersection over Union (IoU) compared to the baseline UNet model, a 1.21% increase in IoU over UNet-ResNet101, and a 1.60% increase in IoU over UNet-ResNet152. We evaluated the proposed approach on the INRIA Aerial Image dataset, where it outperformed these models. Our research addresses a critical need for accurate building extraction from RS images and overcomes the challenges posed by diverse building characteristics.
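For readers unfamiliar with the metric reported above, the following is a minimal sketch of how IoU can be computed for a single pair of binary building masks. It is not the authors' evaluation code: the function name `iou`, the list-of-lists mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
def iou(pred, target):
    """Intersection over Union of two binary masks.

    Both masks are equally sized 2D lists of 0/1 values, where 1 marks a
    building pixel (an illustrative representation, not the paper's).
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1  # pixel is building in both masks
            if p or t:
                union += 1  # pixel is building in at least one mask
    # Assumed convention: two empty masks count as perfect agreement.
    return intersection / union if union else 1.0


# Toy 2x3 example: 2 pixels overlap, 4 pixels are marked in either mask.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(iou(pred, target))  # 0.5
```

IoU ranges from 0 (no overlap) to 1 (identical masks), so it penalizes both missed building pixels and false detections.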

Full Text