Abstract

Remote sensing images have rich content, so the features extracted by general deep models are easily disturbed by complex backgrounds. As a result, key features are not extracted well, and the spatial information of the image is difficult to express. A deep convolutional neural network based on multi-scale pooling and a norm attention mechanism is proposed, which adaptively weights salient features at both the channel level and the spatial level. First, in the multi-scale pooling channel attention module, max pooling at different scales is applied to the feature map of each channel, following spatial pyramid pooling. Next, the resulting feature maps of different sizes are transformed to a uniform size by adaptive average pooling, and the salient features of different scales are combined by element-wise addition. Then, in the norm spatial attention module, the pixels at the same spatial position across all channels are formed into vectors, and a feature map carrying spatial information is obtained by computing the L1 norm and L2 norm of these vectors. Finally, a cascaded pooling method is adopted to optimize the high-level features, which are then used for remote sensing image retrieval. Experiments are conducted on the UC Merced, AID, and NWPU-RESISC45 data sets. The results show that the proposed attention model improves retrieval performance by attending to salient features at different scales and incorporating spatial information.
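
The two attention computations described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names (`adaptive_pool`, `multiscale_channel_descriptor`, `norm_spatial_maps`), the pooling scales (1, 2, 4), and the choice of 1x1 as the uniform size are all assumptions made for illustration, since the abstract does not state them. The abstract also does not say how the pooled descriptors or the two norm maps are turned into attention weights, so the sketch stops at the raw descriptor and the raw norm maps.

```python
import math


def adaptive_pool(fmap, out_size, reduce):
    """Pool a 2-D map (list of rows) into out_size x out_size bins using `reduce`."""
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for i in range(out_size):
        # Bin boundaries: floor for the start, ceil for the end, so no bin is empty.
        r0, r1 = (i * h) // out_size, math.ceil((i + 1) * h / out_size)
        row = []
        for j in range(out_size):
            c0, c1 = (j * w) // out_size, math.ceil((j + 1) * w / out_size)
            window = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(reduce(window))
        pooled.append(row)
    return pooled


def mean(values):
    return sum(values) / len(values)


def multiscale_channel_descriptor(fmap, scales=(1, 2, 4)):
    """Max-pool one channel at several scales, average each result to 1x1, then add.

    The scales and the 1x1 target size are illustrative assumptions.
    """
    total = 0.0
    for s in scales:
        maxed = adaptive_pool(fmap, s, max)               # pyramid-style max pooling
        total += adaptive_pool(maxed, 1, mean)[0][0]      # average pool to a uniform size
    return total                                          # element-wise addition across scales


def norm_spatial_maps(features):
    """features[c][y][x] -> (L1 map, L2 map) of the channel vector at each position."""
    h, w = len(features[0]), len(features[0][0])
    l1 = [[sum(abs(ch[y][x]) for ch in features) for x in range(w)] for y in range(h)]
    l2 = [[math.sqrt(sum(ch[y][x] ** 2 for ch in features)) for x in range(w)]
          for y in range(h)]
    return l1, l2
```

Given a feature tensor stored as nested lists `features[c][y][x]`, calling `multiscale_channel_descriptor(features[c])` yields one scalar per channel, and `norm_spatial_maps(features)` yields two maps with the same height and width as the input. In a real network these values would presumably be passed through learned layers and a squashing function to produce the attention weights, but that step is outside what the abstract specifies.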
