Abstract

Hashing has been widely used for approximate nearest neighbor search over high-dimensional multimedia data. In this paper, we propose a novel hash learning framework that maps high-dimensional multimodal data into a common Hamming space where cross-modal similarity can be measured by Hamming distance. Unlike existing cross-modal hashing methods, which learn hash functions as numeric quantizations of linear projections, the proposed hash learning algorithm encodes features' ranking properties and takes advantage of rank correlations, which are known to be scale-invariant, numerically stable, and highly nonlinear. Specifically, we jointly learn two groups of subspaces, one for each modality, so that the ranking orders in those subspaces maximally preserve the cross-modal similarity. Extensive experiments on real-world datasets demonstrate the superiority of the proposed methods over state-of-the-art baselines.
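To make the rank-based encoding idea concrete, the following is a minimal sketch, not the paper's learned solution: each hash function projects a sample into a small subspace and records the index of the largest projected coordinate, so codes from two modalities live in a common space and can be compared by Hamming distance. The projection matrices here are random stand-ins for the jointly learned subspaces, and all names (`rank_hash`, `W_img`, `W_txt`, the dimensions) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def rank_hash(X, W_list):
    """Encode samples X (n x p) with one argmax symbol per subspace.

    Each W in W_list is a (p x d) projection; the code symbol is the
    position of the maximum projected coordinate, which depends only on
    the ranking order and is therefore scale-invariant.
    """
    return np.stack([np.argmax(X @ W, axis=1) for W in W_list], axis=1)

def hamming(cx, cy):
    """Hamming distance between codes: count of differing symbols."""
    return np.sum(cx != cy, axis=-1)

# Toy example with random (untrained) projections for two modalities:
# m hash functions, each using a d-dimensional subspace.
n, p_img, p_txt, m, d = 5, 128, 64, 16, 4
W_img = [rng.standard_normal((p_img, d)) for _ in range(m)]
W_txt = [rng.standard_normal((p_txt, d)) for _ in range(m)]

X_img = rng.standard_normal((n, p_img))   # image-modality features
X_txt = rng.standard_normal((n, p_txt))   # text-modality features

codes_img = rank_hash(X_img, W_img)       # (n, m) codes in the common space
codes_txt = rank_hash(X_txt, W_txt)
print(hamming(codes_img[0], codes_txt))   # cross-modal distances for query 0
```

In the proposed framework, the projections would be learned jointly across modalities so that small Hamming distance corresponds to high cross-modal similarity; the sketch only shows the encoding and comparison steps.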
