Extraction of building footprint using MASK-RCNN for high resolution aerial imagery

Jenila Vincent M, Varalakshmi P

doi:10.1088/2515-7620/ad5b3d

Abstract

Extracting individual buildings from satellite images is crucial for urban applications such as population estimation and urban planning. However, extracting building footprints from remote sensing data is challenging because of scale differences, complex structures, and the variety of building types. To address these issues, this paper proposes an approach that efficiently detects buildings in images by generating a segmentation mask for each instance. The approach uses the Mask Region-based Convolutional Neural Network (MASK-RCNN), which extends Faster R-CNN's bounding box recognition with a branch for object mask prediction. It was evaluated against YOLOv5, YOLOv7, and YOLOv8 in a comparative study to assess its effectiveness. The findings reveal that the proposed method achieved the highest accuracy in building extraction among the compared models. Furthermore, in experiments on the well-established WHU and INRIA datasets, our method consistently outperformed other existing methods, producing reliable results.

Full Text