Abstract

With the rapid growth in the volume of collected high-resolution aerial image data, effective image retrieval for remote sensing (RS) has become an active area of research. Content-based image retrieval (CBIR) for high-resolution remote sensing (HRRS) requires robust feature extraction and representation to capture high-level semantics, followed by a similarity measure that compares images based on these semantics. In this paper, we propose a novel sparse graph learning-based triplet network for deep metric learning-based HRRS CBIR. First, graph representations of the aerial images are constructed from local feature descriptors and unsupervised segmentation, which yields an efficient graph-structured representation of each image. Then, a novel attention graph convolutional network extracts high-level semantics from the spatial patterns of the images’ segmented regions by aggregating region-node features. The attention mechanism and the pyramid pooling structure of the graph network help the model extract powerful spatial features, preserving the structural information of the images while reducing the dimensionality of the data. Next, a novel task-driven dictionary learning (TDDL) method based on the triplet loss constructs a sparse metric space for similarity measurement, in which the sparse codes computed for the images promote intra-class similarity between images of the same class and inter-class dissimilarity between images of different classes. The proposed sparse features reduce the dimensionality of the model parameters, which mitigates overfitting and improves the generalization of the overall deep metric learning framework. Our TDDL method adopts a bi-level optimization strategy so that the dictionary can be trained jointly with the parameters of the graph representation network in an end-to-end fashion. Finally, we conduct extensive evaluations of the proposed architecture on real-world datasets, where it obtains superior results compared to state-of-the-art models.
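
To make the role of the triplet loss on sparse codes concrete, the following is a minimal sketch, not the paper's TDDL method. It assumes a squared Euclidean distance, a hinge-style triplet loss with a hypothetical `margin` parameter, and element-wise soft-thresholding as a stand-in for the sparse coding step; the bi-level dictionary optimization and the graph network are omitted entirely. All function names are illustrative.

```python
import math


def soft_threshold(v, lam):
    """Element-wise soft-thresholding (proximal operator of the L1 norm).

    Assumption: used here only as a simple stand-in for a sparse coding step;
    entries with magnitude below `lam` are set to zero.
    """
    return [math.copysign(max(abs(x) - lam, 0.0), x) for x in v]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length codes."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss.

    The loss is zero once the anchor is closer to the positive (same class)
    than to the negative (different class) by at least `margin`.
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


if __name__ == "__main__":
    # Hypothetical dense embeddings for three images.
    anchor = soft_threshold([0.9, 0.05, -0.02], lam=0.1)
    positive = soft_threshold([0.8, -0.03, 0.04], lam=0.1)
    negative = soft_threshold([0.02, 0.95, 0.01], lam=0.1)
    print(triplet_loss(anchor, positive, negative))
```

Minimizing this loss over many triplets pulls the codes of same-class images together and pushes the codes of different-class images apart, which is the behavior the abstract attributes to the sparse metric space.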
