Abstract

To capture the long-range dependence of an input image for remote sensing scene (RSS) classification, in this letter we propose a general positional context aggregation (PCA) module for deep convolutional neural networks. The PCA module takes the form of a self-attention mechanism in which two proposed blocks, spatial context aggregation (SCA) and relative position encoding (RPE), capture spatially partitioned contextual information and relative positional information, respectively. Compared with the classical self-attention mechanism, the global attention maps extracted by PCA therefore not only distinguish between regions but also satisfy translation equivariance, which has been shown to benefit scene classification. To demonstrate the superiority of the PCA module, we implement it on a pretrained ResNet [the so-called PCA network (PCANet)] and report results on five popular RSS classification benchmarks. Experimental results show that the PCA module significantly improves RSS classification performance, and that PCANet50 achieves state-of-the-art results on these data sets.
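The abstract's key claim is that adding a bias indexed by *relative* positions to self-attention scores makes the attention map translation-equivariant. Below is a minimal NumPy sketch of single-head self-attention over an H x W feature map with such a relative-position bias. All names, shapes, and the identity projections are illustrative assumptions; the letter's actual SCA and RPE blocks are not specified in this abstract.

```python
import numpy as np

def rel_pos_attention(x, rel_bias):
    """Illustrative self-attention over an (H, W, C) feature map.

    `rel_bias` has shape (2H-1, 2W-1): one learned scalar per relative
    offset (dy, dx). Because the bias depends only on positional
    differences, shifting the input shifts the output identically
    (translation equivariance). Projections are omitted for brevity.
    """
    h, w, c = x.shape
    n = h * w
    tokens = x.reshape(n, c)
    # Content term of the attention logits (scaled dot product).
    scores = tokens @ tokens.T / np.sqrt(c)
    # Positional term: look up the bias for each pair's offset.
    ys, xs = np.divmod(np.arange(n), w)
    dy = ys[:, None] - ys[None, :]          # in [-(h-1), h-1]
    dx = xs[:, None] - xs[None, :]          # in [-(w-1), w-1]
    scores += rel_bias[dy + h - 1, dx + w - 1]
    # Row-wise softmax, numerically stabilized.
    attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)
    return (attn @ tokens).reshape(h, w, c)

h, w, c = 4, 4, 8
rng = np.random.default_rng(0)
bias = rng.normal(size=(2 * h - 1, 2 * w - 1))
out = rel_pos_attention(rng.normal(size=(h, w, c)), bias)
print(out.shape)  # (4, 4, 8)
```

In an absolute encoding, the bias would instead be indexed by each token's position, breaking equivariance under shifts; the relative indexing above is what the abstract credits for the equivariance property.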
