Abstract

Cross-modal hashing has attracted considerable attention in large-scale information retrieval owing to its storage and query efficiency. Despite the great success achieved by supervised approaches, existing unsupervised hashing methods still suffer from the lack of reliable learning guidance and from the discrepancy across modalities. In this paper, we propose Aggregation-based Graph Convolutional Hashing (AGCH) to tackle these obstacles. First, considering that a single similarity metric can hardly capture data relationships comprehensively, we develop an efficient aggregation strategy that utilises multiple metrics to construct a more precise affinity matrix for learning. Specifically, we apply various similarity measures to exploit the structural information of multiple modalities from different perspectives and then aggregate the resulting information into a joint similarity matrix. Furthermore, a novel deep model is designed to learn unified binary codes across modalities; its key components are modality-specific encoders, Graph Convolutional Networks (GCNs) and a fusion module. The modality-specific encoders learn feature embeddings for each individual modality. On this basis, we leverage GCNs to further mine the semantic structure of the data, along with a fusion module that correlates the different modalities. Extensive experiments on three real-world datasets demonstrate that the proposed method significantly outperforms state-of-the-art competitors.
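
The abstract does not spell out the aggregation rule, so the following is a minimal sketch of the general idea: per-modality affinities are computed under several similarity measures and combined into one joint matrix. The function names, the choice of cosine and Gaussian-kernel metrics, and the plain weighted average are illustrative assumptions, not AGCH's actual formulation.

```python
import numpy as np

def cosine_affinity(feats):
    """Pairwise cosine similarity over row-wise feature vectors."""
    normed = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    return normed @ normed.T

def gaussian_affinity(feats, sigma=1.0):
    """Pairwise Gaussian-kernel similarity from Euclidean distances."""
    sq = (feats ** 2).sum(axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * feats @ feats.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def joint_affinity(modal_feats, weights=None):
    """Aggregate per-modality, per-metric affinities into one joint
    similarity matrix (a plain weighted average here; an assumption)."""
    mats = [metric(f) for f in modal_feats
            for metric in (cosine_affinity, gaussian_affinity)]
    if weights is None:
        weights = [1.0 / len(mats)] * len(mats)
    return sum(w * s for w, s in zip(weights, mats))

# Toy usage: image (512-d) and text (300-d) features for 5 paired items.
rng = np.random.default_rng(0)
S = joint_affinity([rng.normal(size=(5, 512)), rng.normal(size=(5, 300))])
print(S.shape)  # (5, 5)
```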
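The encoder-GCN-fusion pipeline can likewise be sketched as a skeleton, again under stated assumptions: layer sizes, layer counts, and the tanh/sign relaxation (a common trick for differentiable hash learning) are placeholders rather than the paper's design. The joint affinity matrix constructed above would serve as the graph over which the GCNs propagate.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph convolution: propagate node features over the
    (assumed pre-normalised) joint affinity matrix, then transform."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        return torch.relu(self.linear(adj @ x))

class AGCHSketch(nn.Module):
    """Illustrative skeleton only: modality-specific encoders, one GCN
    per modality, and a fusion layer emitting unified hash codes."""
    def __init__(self, img_dim=512, txt_dim=300, hid=256, n_bits=64):
        super().__init__()
        self.img_enc = nn.Sequential(nn.Linear(img_dim, hid), nn.ReLU())
        self.txt_enc = nn.Sequential(nn.Linear(txt_dim, hid), nn.ReLU())
        self.img_gcn = GCNLayer(hid, hid)
        self.txt_gcn = GCNLayer(hid, hid)
        self.fusion = nn.Linear(2 * hid, n_bits)

    def forward(self, img, txt, adj):
        h_img = self.img_gcn(self.img_enc(img), adj)
        h_txt = self.txt_gcn(self.txt_enc(txt), adj)
        b = torch.tanh(self.fusion(torch.cat([h_img, h_txt], dim=1)))
        return torch.sign(b), b  # discrete codes and continuous relaxation
```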
