Abstract

Remote sensing image semantic segmentation, which aims to assign a class label to every pixel of a remote sensing image, has broad applications in many fields. Owing to the strength of deep learning (DL), semantic segmentation models based on convolutional neural networks (CNNs) have greatly advanced remote sensing image semantic segmentation. However, because high-resolution remote sensing images (HRRSIs) combine high resolution, wide coverage, large data volume, and substantial spectral variation, existing GPUs cannot perform semantic segmentation on a whole image directly. Cutting the image into small patches loses context information and therefore degrades accuracy. To address this issue, we propose the multiscale context self-attention network (MSCSANet), which combines the benefits of the self-attention mechanism with a CNN to improve segmentation quality on various remote sensing images. MSCSANet extracts multiscale features from multiscale context images to compensate for the features lost when the image is cropped into patches. In addition, to exploit large-scale context, the multiscale context patches guide the local image patch to attend to objects at different granularities, thereby enhancing the features of the local patch. Moreover, considering limited computing resources, we design a linear self-attention module to reduce computational complexity. Compared with other DL models, the proposed model strengthens multiscale feature representation in complex scenes and improves mean intersection over union (MIoU) by 1.56% on the Gaofen Image Dataset and by 1.93% on the ISPRS Potsdam Dataset.
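The abstract does not give the exact formulation of the linear self-attention module, so the following is only a minimal sketch of one common way to make self-attention linear in the number of pixels, the kernel feature-map approach of Katharopoulos et al. (2020): replacing softmax(QK^T)V, which costs O(N^2) for N spatial positions, with phi(Q)(phi(K)^T V), which costs O(N). All class and variable names here are illustrative, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LinearSelfAttention(nn.Module):
    """Hypothetical linear self-attention over a 2D feature map (sketch only)."""

    def __init__(self, dim: int):
        super().__init__()
        # 1x1 convolutions produce query/key/value maps from the input features
        self.to_q = nn.Conv2d(dim, dim, kernel_size=1)
        self.to_k = nn.Conv2d(dim, dim, kernel_size=1)
        self.to_v = nn.Conv2d(dim, dim, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Flatten spatial dimensions: (b, c, h, w) -> (b, n, c) with n = h * w
        q = self.to_q(x).flatten(2).transpose(1, 2)
        k = self.to_k(x).flatten(2).transpose(1, 2)
        v = self.to_v(x).flatten(2).transpose(1, 2)

        # Positive feature map phi(x) = elu(x) + 1 keeps attention weights non-negative
        q = F.elu(q) + 1
        k = F.elu(k) + 1

        # Associativity trick: compute k^T v first, giving a (b, c, c) matrix,
        # so the cost grows linearly with n instead of quadratically
        kv = torch.einsum("bnc,bnd->bcd", k, v)
        # Per-position normalizer, analogous to the softmax denominator
        z = 1.0 / (torch.einsum("bnc,bc->bn", q, k.sum(dim=1)) + 1e-6)
        out = torch.einsum("bnc,bcd->bnd", q, kv) * z.unsqueeze(-1)
        return out.transpose(1, 2).reshape(b, c, h, w)


# Example: attend over a 64x64 feature map (4096 positions) without ever
# materializing a 4096 x 4096 attention matrix
attn = LinearSelfAttention(dim=32)
features = torch.randn(2, 32, 64, 64)
print(attn(features).shape)  # torch.Size([2, 32, 64, 64])
```

Under this formulation, the same module could in principle let a local patch attend to features from larger-scale context patches by drawing keys and values from the context branch, which matches the guidance mechanism the abstract describes, though the actual MSCSANet design may differ.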
