Abstract

Accurate building extraction from very high-resolution (VHR) remote sensing images plays an important role in urban dynamic monitoring, planning, and management. However, extracting buildings with high accuracy and integrity remains challenging because of the diverse appearance of buildings and the complex ground background in VHR remote sensing images. Recently, the U-shaped network UNet has proven capable of feature extraction and semantic segmentation of remote sensing images. However, UNet does not capture sufficient multi-scale and multi-level features with large receptive fields. To address these problems, an improved network based on the UNet structure (Refine-UNet) is proposed for extracting buildings from VHR images. The proposed Refine-UNet consists mainly of an encoder module, a decoder module, and a refine skip connection scheme. The refine skip connection scheme is composed of an atrous spatial pyramid pooling (ASPP) module and several improved depthwise separable convolution (IDSC) modules. Experimental results on the Jilin-1 VHR datasets, which have a spatial resolution of 0.75 m, demonstrate that, compared with UNet, PSPNet, DeepLabV3+, and SegNet, the proposed Refine-UNet obtains more accurate building extraction results and achieves the best Precision (95.1%) and intersection over union (IoU, 87.0%), indicating strong practical potential.
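
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how Precision and IoU are commonly computed for a binary building mask. It is not taken from the paper: the function name `precision_and_iou`, the toy masks, and the convention that 1 marks a building pixel are all assumptions made for illustration, and the authors' evaluation code may differ (for example, in how scores are averaged over images).

```python
def precision_and_iou(pred, truth):
    """Compute Precision and IoU for the building class.

    pred, truth: equal-sized 2D lists of 0/1 values, where 1 is assumed
    to mark a building pixel (hypothetical convention for this sketch).
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1  # building correctly predicted
            elif p == 1 and t == 0:
                fp += 1  # background predicted as building
            elif p == 0 and t == 1:
                fn += 1  # building pixel that was missed
    # Precision = TP / (TP + FP); IoU = TP / (TP + FP + FN).
    # Both fall back to 0.0 when the denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return precision, iou


# Toy 2x3 masks: 2 true positives, 1 false positive, 1 false negative.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
precision, iou = precision_and_iou(pred, truth)
print(f"Precision = {precision:.3f}, IoU = {iou:.3f}")
```

Because IoU also penalizes missed building pixels (false negatives), it is never higher than Precision on the same prediction, which is consistent with the reported 87.0% IoU sitting below the 95.1% Precision.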
