Abstract

In remote sensing images (RSIs), accurate semantic segmentation faces particular challenges due to small targets, class imbalance, and complex scenes. Restricted by the local receptive field of convolution layers, traditional semantic segmentation models cannot exploit the global information in RSIs. Motivated by these characteristics, we propose RSANet, a network based on a regional self-attention mechanism. Our model is no longer limited by the locality of convolutions but propagates information across the whole image. It mines the relationships between pixels in surrounding regions, which better matches how image content is understood. Moreover, compared with the traditional self-attention mechanism, RSANet effectively reduces feature-map noise and the interference of redundant features. Our model achieves better semantic segmentation results than other current models on the DroneDeploy data set and the Chreos semantic segmentation data set. Experiments show that RSANet achieves a mean intersection over union (mIoU) 2% higher than the baseline model, with notable gains in fineness, edge integrity, and classification accuracy.
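The abstract does not give implementation details, but the core idea of restricting self-attention to surrounding regions can be illustrated with a minimal NumPy sketch. This is an assumption-laden toy version: attention is computed within non-overlapping square windows, with no learned query/key/value projections, and the function and parameter names are hypothetical, not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def regional_self_attention(feat, window=4):
    """Toy self-attention restricted to non-overlapping windows.

    feat: (H, W, C) feature map; H and W must be divisible by `window`.
    Each pixel attends only to pixels inside its own window, so the
    attention cost grows with the window size rather than with the
    full image size (the locality idea behind regional attention).
    """
    H, W, C = feat.shape
    assert H % window == 0 and W % window == 0
    out = np.empty_like(feat)
    for i in range(0, H, window):
        for j in range(0, W, window):
            # Flatten the window into a (window*window, C) matrix.
            region = feat[i:i + window, j:j + window].reshape(-1, C)
            # Queries, keys, and values are the raw features here;
            # a real model would use learned linear projections.
            scores = softmax(region @ region.T / np.sqrt(C), axis=-1)
            attended = scores @ region
            out[i:i + window, j:j + window] = attended.reshape(window, window, C)
    return out
```

Each output pixel is a convex combination of the features in its window, so the output keeps the input shape while mixing information only within each local region.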
