Abstract
Deep learning is increasingly popular in the remote sensing community and has already proven successful in land cover classification and semantic segmentation. However, most studies are limited to optical datasets. Despite a few attempts to apply deep learning to synthetic aperture radar (SAR), its huge potential, especially for very high resolution (VHR) SAR, remains underexploited. Taking building segmentation as an example, VHR SAR datasets are, to the best of our knowledge, still missing. A comparable baseline for SAR building segmentation does not exist, and which segmentation method is most suitable for SAR imagery is poorly understood. This article first provides a benchmark high-resolution (1 m) GaoFen-3 SAR dataset covering nine cities in seven countries, then reviews state-of-the-art semantic segmentation methods applied to SAR, and finally summarizes potential operations to improve performance. With these comprehensive assessments, we hope to provide recommendations and a roadmap for future SAR semantic segmentation.
Highlights
Because buildings are the main component of cities, building semantic segmentation attracts increasing attention in urban remote sensing studies
High-resolution net (HRNet) obtained the best performance on both the RGB and the synthetic aperture radar (SAR) datasets in terms of IoU and F1 scores, followed by U-Net
We investigated the performance with different pretraining weights for the ResNeXt101_32×8d encoder, including ImageNet, Instagram, SSL on ImageNet, and SWSL on ImageNet
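The IoU and F1 scores used to rank the models above follow the standard definitions for binary segmentation masks. A minimal sketch of those definitions is shown below; this is an illustration of the metrics, not the evaluation code used by the article.

```python
def iou_f1(pred, target):
    """IoU and F1 (Dice) for binary masks given as flat 0/1 sequences.

    Illustrative sketch of the standard metric definitions, assuming the
    building class is the positive class.
    """
    tp = sum(1 for p, t in zip(pred, target) if p and t)      # true positives
    fp = sum(1 for p, t in zip(pred, target) if p and not t)  # false positives
    fn = sum(1 for p, t in zip(pred, target) if not p and t)  # false negatives
    # Guard against empty masks, where both metrics are conventionally 1.
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 1.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 1.0
    return iou, f1

# Example: one pixel right, one false alarm, one miss.
iou, f1 = iou_f1([1, 1, 0, 0], [1, 0, 1, 0])  # iou = 1/3, f1 = 0.5
```

Note that F1 is always greater than or equal to IoU for the same prediction, so the two scores rank models similarly but are not interchangeable.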
Summary
Because buildings are the main component of cities, building semantic segmentation attracts increasing attention in urban remote sensing studies. For instance, Shahzad et al. [19] combined fully convolutional neural networks with a conditional random field to detect buildings in TerraSAR-X SAR images. Yao et al. [24] constructed datasets from three sources (with a resolution of 2.9 m): TerraSAR-X images, Google Earth images, and OpenStreetMap (OSM) data, to perform SAR and optical image semantic segmentation. We included Google Earth images as the optical modality to thoroughly investigate the performance of the different modalities and their combinations using deep-learning baseline models. These baseline models are fundamental to the community and can help us deeply understand the capability of state-of-the-art segmentation models when working with SAR data. Finally, the influences on performance and potential solutions to improve it are given.