Abstract

This paper proposes a deep neural network for semantic segmentation of high-resolution remote sensing images, targeting urban construction land classification. The network follows a high-resolution network architecture. On the paths of different resolutions, we propose a directional self-attention that corrects the directional bias introduced by strip-window attention during learning while also reducing computational complexity, improving both accuracy and speed. At the end of the network, a distributed alignment module with spatial information trains additional learnable parameters through a two-stage learning strategy to adjust the biased decision boundaries, alleviating the accuracy degradation caused by unbalanced training data. We evaluated the proposed method against current state-of-the-art semantic segmentation methods on the Luojia-FGLC and WHDLD datasets, where it obtained the best performance. Ablation experiments verified the effectiveness of each component of the network. The code and model will be available at https://github.com/Zhzhyd/DWin-HRFormer.
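
The abstract does not give the alignment module's formulation, so the following is only an illustrative sketch of the general idea of adjusting biased decision boundaries with a few extra learnable parameters. It assumes a per-class affine recalibration of the segmentation logits, blended with the original logits by a per-pixel gate that stands in for the spatial information. The names `align_logits`, `alpha`, `beta`, and `gate` are hypothetical and do not come from the paper or its repository.

```python
import numpy as np

def align_logits(logits, alpha, beta, gate):
    """Recalibrate class logits with per-class scale and offset (assumed form).

    logits: (C, H, W) raw class scores from a frozen first-stage network
    alpha:  (C,) per-class scale, would be learned in the second stage
    beta:   (C,) per-class offset, would be learned in the second stage
    gate:   (H, W) per-pixel blending weight in [0, 1]
    """
    calibrated = alpha[:, None, None] * logits + beta[:, None, None]
    # Blend calibrated and original scores pixel by pixel.
    return gate[None] * calibrated + (1.0 - gate[None]) * logits

rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 4, 4))   # 3 classes on a 4x4 tile
alpha = np.ones(3)                    # leave the scales unchanged
beta = np.array([0.0, 0.0, 2.0])      # raise the score of a hypothetical rare class 2
gate = np.full((4, 4), 0.5)           # uniform gate, for illustration only
aligned = align_logits(logits, alpha, beta, gate)
```

In a two-stage setup of this kind, the backbone would be trained first and then frozen, and only `alpha`, `beta`, and the gate would be fitted in the second stage, typically with a class-rebalanced loss.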
