Abstract
In this work, we propose a new deep convolutional neural network (DCNN) architecture for semantic segmentation of aerial imagery. Taking advantage of recent research, we use split-attention networks (ResNeSt) as the backbone for high-quality feature extraction. Additionally, a disentangled non-local (DNL) block is integrated into our pipeline to model long-range inter-pixel dependencies and highlight edge pixels simultaneously. Moreover, depth-wise separable convolution and atrous spatial pyramid pooling (ASPP) modules are combined to extract and fuse multiscale contextual features. Finally, an auxiliary edge detection task is designed to provide edge constraints for the semantic segmentation. We evaluate our method on two benchmarks provided by the International Society for Photogrammetry and Remote Sensing (ISPRS). Extensive experiments demonstrate the effectiveness of each module of our architecture. Precision evaluation on the Potsdam benchmark shows that the proposed DCNN achieves performance competitive with state-of-the-art methods.
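The depth-wise separable convolution mentioned above factors a standard convolution into a per-channel spatial filter followed by a 1x1 point-wise channel mixer, which is what makes multi-branch modules such as ASPP affordable. The following NumPy sketch is purely illustrative (it is not the authors' implementation; valid padding and stride 1 are assumed, and all names are hypothetical):

```python
import numpy as np

def depthwise_separable_conv(x, dw_kernels, pw_weights):
    """Illustrative depth-wise separable convolution (valid padding, stride 1).

    x          : (H, W, C_in) input feature map
    dw_kernels : (k, k, C_in) one spatial kernel per input channel
    pw_weights : (C_in, C_out) 1x1 point-wise channel-mixing weights
    """
    H, W, C_in = x.shape
    k = dw_kernels.shape[0]
    Ho, Wo = H - k + 1, W - k + 1
    # Depth-wise step: each channel is filtered independently.
    dw = np.zeros((Ho, Wo, C_in))
    for c in range(C_in):
        for i in range(Ho):
            for j in range(Wo):
                dw[i, j, c] = np.sum(x[i:i + k, j:j + k, c] * dw_kernels[:, :, c])
    # Point-wise step: a 1x1 convolution mixes information across channels.
    return dw @ pw_weights

# Parameter-count comparison for k=3, C_in=3, C_out=16:
# standard conv:  k*k*C_in*C_out = 432 weights
# separable conv: k*k*C_in + C_in*C_out = 75 weights
```

The factorization reduces the weight count from k·k·C_in·C_out to k·k·C_in + C_in·C_out, which is why several dilated branches can be stacked in an ASPP-style module at modest cost.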
Highlights
We propose a novel convolutional neural network based on ResNeSt for semantic segmentation of aerial imagery.
We trained a 101-layer “RS+Edge+DNLLAST” network on the Potsdam dataset and compared it with state-of-the-art semantic segmentation DCNNs in the field of remote sensing.

Introduction
Urbanization, globalization and sometimes even disasters lead to rapid changes in the type of land use and land cover (LULC) [1]