Abstract

Cross-media retrieval is an essential approach for handling the explosive growth of multimodal data on the web. However, existing approaches to cross-media retrieval are computationally expensive due to the high dimensionality of the data. To retrieve efficiently from multimodal data, it is essential to reduce the proportion of irrelevant documents that must be examined. In this paper, we propose a fast cross-media retrieval approach (FCMR) based on locality-sensitive hashing (LSH) and neural networks. One modality of the multimodal data is projected by the LSH algorithm so that similar objects fall into the same hash bucket and dissimilar objects fall into different ones; the other modality is then mapped into these hash buckets using hash functions learned by neural networks. Given a textual or visual query, it can be efficiently mapped to a hash bucket whose stored objects are likely near neighbors of the query. Experimental results show that the proportion of relevant documents in the set of near neighbors obtained by the proposed method is substantially increased, indicating that retrieval based on near neighbors can be conducted effectively. Further evaluations on two public datasets demonstrate the efficacy of the proposed retrieval method compared to the baselines.
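
The bucketing step described above can be illustrated with a minimal random-hyperplane LSH sketch in Python. The hyperplane construction, feature dimensions, and bucket dictionary below are illustrative assumptions rather than the paper's exact construction, and the learned neural-network hash functions for the second modality are not reproduced here.

```python
import numpy as np

# Minimal random-hyperplane LSH sketch (illustrative; not the paper's exact construction).
# Feature vectors of one modality (e.g., image features) are hashed into buckets so that
# nearby vectors tend to share a bucket; a query is then mapped to a single bucket.

rng = np.random.default_rng(0)
dim, n_bits = 128, 12                        # assumed feature dimension and hash-code length
hyperplanes = rng.standard_normal((n_bits, dim))

def lsh_code(x):
    """Sign of the projection onto each random hyperplane gives one hash bit."""
    return tuple((hyperplanes @ x > 0).astype(int))

# Index one modality: bucket id -> list of document ids.
features = rng.standard_normal((1000, dim))  # placeholder feature vectors
buckets = {}
for doc_id, x in enumerate(features):
    buckets.setdefault(lsh_code(x), []).append(doc_id)

# At query time, only documents in the query's bucket are candidate near neighbors.
query = features[0] + 0.05 * rng.standard_normal(dim)  # a slightly perturbed document
candidates = buckets.get(lsh_code(query), [])
print(len(candidates), "candidate near neighbors instead of", len(features), "documents")
```

Restricting the search to a single bucket is what reduces the proportion of irrelevant documents considered per query; the second modality would be mapped into the same bucket space by the learned hash functions.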
