Abstract
Hashing methods have been widely used for approximate nearest neighbor search in recent years owing to their computational and storage efficiency. Most existing multimodal hashing methods preserve similarity relationships based on either metric distances or semantic labels in a procrustean way, ignoring the intra-class and inter-class variations inherent in the metric space. In this paper, we propose a novel multimodal hashing method, termed semantic neighbor graph hashing (SNGH), which aims to preserve a fine-grained similarity metric based on a semantic graph constructed by jointly pursuing semantic supervision and local neighborhood structure. Specifically, a semantic graph is constructed to capture the local similarity structure of the image modality and the text modality, respectively. Furthermore, we define a function based on the local similarity to adaptively calculate multi-level similarities by encoding the intra-class and inter-class variations. After obtaining the unified hash codes, logistic regression with the kernel trick is employed to learn view-specific hash functions independently for each modality. Extensive experiments are conducted on four widely used multimodal data sets. The experimental results demonstrate the superiority of the proposed SNGH method over state-of-the-art multimodal hashing methods.
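To make the idea of a semantic neighbor graph with multi-level similarities concrete, the following is a minimal sketch for a single modality. It is not the SNGH formulation itself, which the abstract does not specify: the function name `semantic_neighbor_similarity`, the k-nearest-neighbor rule, the Gaussian local similarity, the bandwidth `sigma`, and the signed weighting scheme are all assumptions chosen for illustration.

```python
import math


def semantic_neighbor_similarity(features, labels, k=2, sigma=1.0):
    """Illustrative (hypothetical) multi-level similarity matrix.

    Edges exist only between k-nearest-neighbor pairs (local structure).
    Same-class edges get a positive, graded weight; different-class edges
    get a negative, graded weight (semantic supervision). Non-edges are 0.
    """
    n = len(features)
    # Pairwise Euclidean distances in the feature (metric) space.
    dist = [[math.dist(features[i], features[j]) for j in range(n)]
            for i in range(n)]

    # k nearest neighbors of each sample, excluding the sample itself.
    neighbors = []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: dist[i][j])
        neighbors.append(set(order[:k]))

    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                sim[i][j] = 1.0
                continue
            # Keep an edge if either sample lists the other as a neighbor,
            # which makes the graph symmetric.
            if j not in neighbors[i] and i not in neighbors[j]:
                continue
            # Assumed Gaussian local similarity in (0, 1].
            local = math.exp(-dist[i][j] ** 2 / (2.0 * sigma ** 2))
            if labels[i] == labels[j]:
                # Intra-class variation: closer pairs are more similar.
                sim[i][j] = local
            else:
                # Inter-class variation: farther pairs are more dissimilar.
                sim[i][j] = -(1.0 - local)
    return sim


# Toy usage: two tight clusters with different labels.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
classes = [0, 0, 1, 1]
S = semantic_neighbor_similarity(points, classes, k=2)
```

In this sketch, entries of `S` take graded values rather than the binary similar/dissimilar labels that a purely label-based scheme would assign, which is the sense in which the similarities are "multi-level". In a multimodal setting one such matrix would be built per modality; how the matrices are combined to learn unified hash codes is left out here because the abstract does not describe it.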