Semantic segmentation of high-resolution optical remote sensing images is an important but challenging task. Many semantic segmentation networks fail to use global and local context information efficiently to improve segmentation performance. To address this problem, this paper proposes SDANet, a semantic segmentation network based on sparse self-attention that models global context dependencies. Specifically, the feature maps are first divided into four regions in the spatial and channel dimensions, respectively, and the divided feature maps are rearranged to form new regions. Second, position and channel self-attention operations are performed on the rearranged regions. Third, the feature maps are restored to their original arrangement, and the position and channel self-attention operations are performed again to obtain the output feature maps. Finally, semantic segmentation is completed based on the output feature maps. Extensive experiments on the ISPRS Vaihingen dataset demonstrate that the proposed method outperforms the self-attention-based DANet and CCNet, as well as general semantic segmentation networks such as FCN, DeepLabv3+, and HRNet.
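
The divide, rearrange, attend, restore, attend sequence described in the abstract can be illustrated with a minimal NumPy sketch of the position branch only. This is not the authors' implementation. The abstract does not state the rearrangement rule, whether learned projections are used, or whether the second attention pass runs per region or over the whole map, so the sketch assumes the following: each spatial quadrant is cut into four sub-blocks and new region k collects sub-block k from every quadrant; attention is plain residual dot-product attention without learned weights; and both passes run per quadrant. The names `rearrange`, `attend_per_quadrant`, and `sparse_position_attention` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def position_attention(region):
    """Residual dot-product self-attention over the pixels of one (C, h, w) region.

    Assumption: no learned query/key/value projections, for brevity.
    """
    C, h, w = region.shape
    tokens = region.reshape(C, h * w).T              # (N, C), one token per pixel
    attn = softmax(tokens @ tokens.T / np.sqrt(C))   # (N, N) pixel-to-pixel weights
    out = tokens + attn @ tokens                     # residual connection
    return out.T.reshape(C, h, w)

def rearrange(fmap):
    """Swap the 'quadrant' and 'sub-block' indices of a (C, H, W) map.

    Assumed rule: after the swap, each quadrant holds one sub-block from every
    original quadrant. The swap is its own inverse, so calling it twice
    restores the original arrangement. H and W must be divisible by 4.
    """
    C, H, W = fmap.shape
    x = fmap.reshape(C, 2, 2, H // 4, 2, 2, W // 4)  # (C, qi, si, y, qj, sj, x)
    x = x.transpose(0, 2, 1, 3, 5, 4, 6)             # (C, si, qi, y, sj, qj, x)
    return x.reshape(C, H, W)

def attend_per_quadrant(fmap):
    """Apply position attention independently inside each of the four quadrants."""
    C, H, W = fmap.shape
    h, w = H // 2, W // 2
    out = np.empty_like(fmap)
    for i in range(2):
        for j in range(2):
            rows = slice(i * h, (i + 1) * h)
            cols = slice(j * w, (j + 1) * w)
            out[:, rows, cols] = position_attention(fmap[:, rows, cols])
    return out

def sparse_position_attention(fmap):
    x = rearrange(fmap)            # steps 1-2: divide and rearrange into new regions
    x = attend_per_quadrant(x)     # attention on the rearranged regions
    x = rearrange(x)               # step 3: restore the original arrangement
    return attend_per_quadrant(x)  # attention again on the restored map

fmap = np.random.default_rng(0).standard_normal((8, 8, 8))  # (C, H, W) toy features
out = sparse_position_attention(fmap)
print(out.shape)
```

Under these assumptions, each pass builds four attention matrices of size (N/4) x (N/4) instead of one N x N matrix for N = H x W pixels, while the rearrangement lets information travel between quadrants across the two passes. A channel branch would presumably apply the same pattern along the channel axis.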
