Abstract

Very high-resolution (VHR) remote sensing images provide fine but sometimes trivial ground-object details, which makes semantic labeling of VHR images a challenging task. To improve VHR labeling performance, spatial multiscale information and channel attention have recently been employed. However, global object features are still underexploited, so within-class variation from location to location is not well captured. In this letter, we present a multibranch spatial-channel attention (MSCA) model that efficiently extracts global dependencies and combines them with multiscale and channel attention methods. In the spatial multiscale attention block, a multibranch feature fusion model is established to exploit the global relationships captured by self-attention and the multiscale correlations learned from dilated convolutions. To reduce the computational cost of the pixel-by-pixel self-attention operation, a spatial pyramid compression method is also designed. In the channel attention block, average and max global pooling strategies are applied in two separate channel attention branches to summarize global information from different perspectives. The two blocks are then adaptively united by learnable weighting parameters. Experiments on two VHR image data sets demonstrate that the proposed network yields better performance than the state-of-the-art labeling methods tested.
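To make the mechanisms described above concrete, the following PyTorch sketch shows one plausible realization of the pyramid-compressed self-attention, the dual-pooling channel attention, and the learnable weighted fusion. All module names (PyramidCompressedSelfAttention, DualPoolChannelAttention, WeightedFusion), the pool sizes, the channel reduction ratio, and the scalar fusion weights are illustrative assumptions, not the authors' implementation; the dilated-convolution multiscale branches of the spatial block are omitted for brevity.

import torch
import torch.nn as nn

class PyramidCompressedSelfAttention(nn.Module):
    # Self-attention whose keys and values are pooled to a small pyramid
    # of spatial sizes, so each query attends to a few dozen tokens
    # instead of every pixel (pool sizes are an assumption).
    def __init__(self, channels, pool_sizes=(1, 3, 6)):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, 1)
        self.key = nn.Conv2d(channels, channels // 8, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.pools = nn.ModuleList(nn.AdaptiveAvgPool2d(s) for s in pool_sizes)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable residual weight

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)                             # (B, HW, C//8)
        k = torch.cat([p(self.key(x)).flatten(2) for p in self.pools], dim=2)    # (B, C//8, S)
        v = torch.cat([p(self.value(x)).flatten(2) for p in self.pools], dim=2)  # (B, C, S)
        attn = torch.softmax(q @ k, dim=-1)                                      # (B, HW, S)
        out = (attn @ v.transpose(1, 2)).transpose(1, 2).reshape(b, c, h, w)
        return x + self.gamma * out

class DualPoolChannelAttention(nn.Module):
    # Two channel-attention branches driven by global average pooling and
    # global max pooling, summarizing global information from different
    # perspectives (the shared bottleneck MLP is an assumption).
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.max_pool = nn.AdaptiveMaxPool2d(1)
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )

    def forward(self, x):
        attn = torch.sigmoid(self.mlp(self.avg_pool(x)) + self.mlp(self.max_pool(x)))
        return x * attn

class WeightedFusion(nn.Module):
    # Adaptively unites the two blocks' outputs with learnable scalar
    # weights (one hypothetical form of the adaptive union).
    def __init__(self):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(1))
        self.beta = nn.Parameter(torch.ones(1))

    def forward(self, spatial_out, channel_out):
        return self.alpha * spatial_out + self.beta * channel_out

if __name__ == "__main__":
    x = torch.randn(2, 64, 32, 32)  # a toy batch of feature maps
    y = WeightedFusion()(PyramidCompressedSelfAttention(64)(x),
                         DualPoolChannelAttention(64)(x))
    print(y.shape)  # torch.Size([2, 64, 32, 32])

With pool sizes (1, 3, 6), keys and values are compressed to S = 1 + 9 + 36 = 46 tokens, so each forward pass costs O(HW * S) rather than the O((HW)^2) of pixel-by-pixel self-attention.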
