Abstract

In this paper, we present a novel supervised cross-modal hashing framework, namely Scalable disCRete mATrix faCtorization Hashing (SCRATCH). First, it applies collective matrix factorization to the original features, together with label semantic embedding, to learn latent representations in a shared latent space. It then generates binary hash codes from these latent representations. During optimization, it avoids using a large $n\times n$ similarity matrix and generates the hash codes discretely. Furthermore, based on different objective functions, learning strategies, and features, we present three models within this framework: SCRATCH-o, SCRATCH-t, and SCRATCH-d. The first is a one-step method that learns the hash functions and the binary codes in the same optimization problem. The second is a two-step method that first generates the binary codes and then learns the hash functions from the learned codes. The third is a deep version of SCRATCH-t that uses deep neural networks as hash functions. Extensive experiments on two widely used benchmark datasets demonstrate that SCRATCH-o and SCRATCH-t outperform some state-of-the-art shallow hashing methods for cross-modal retrieval, and that SCRATCH-d outperforms some state-of-the-art deep hashing models.
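
As a rough illustration of the general idea summarized in the abstract, the sketch below factorizes two feature matrices and a label matrix against one shared latent matrix and then thresholds that matrix into binary codes. It is not the SCRATCH objective or its discrete optimization: it assumes a plain regularized least-squares loss solved by alternating least squares, followed by a sign relaxation. The function name `cmf_hash` and the parameters `r`, `iters`, and `lam` are hypothetical. The only point it shares with the text is that every intermediate quantity is at most $r \times n$ or $d \times r$, so no $n\times n$ similarity matrix is ever formed.

```python
import numpy as np


def cmf_hash(X1, X2, Y, r=16, iters=20, lam=1e-2, seed=0):
    """Illustrative collective matrix factorization with sign thresholding.

    X1: (d1, n) features of modality 1, X2: (d2, n) features of modality 2,
    Y: (c, n) label matrix. Returns binary codes B of shape (r, n) with
    entries in {-1, +1}, plus the per-matrix bases.
    Assumption: generic ALS on a squared loss, not the paper's algorithm.
    """
    rng = np.random.default_rng(seed)
    n = X1.shape[1]
    V = rng.standard_normal((r, n))      # shared latent representation
    I_r = np.eye(r)
    mats = [X1, X2, Y]
    for _ in range(iters):
        # Basis update: U_m = M V^T (V V^T + lam I)^{-1} for each matrix M.
        G = V @ V.T + lam * I_r          # (r, r), never (n, n)
        Us = [np.linalg.solve(G, V @ M.T).T for M in mats]
        # Shared latent update: solve (sum U^T U + lam I) V = sum U^T M.
        A = sum(U.T @ U for U in Us) + lam * I_r
        R = sum(U.T @ M for U, M in zip(Us, mats))
        V = np.linalg.solve(A, R)
    # Relaxed binarization: threshold the continuous latent matrix at zero.
    B = np.where(V >= 0, 1, -1).astype(np.int8)
    return B, Us
```

Calling `cmf_hash(X1, X2, Y, r=32)` on column-aligned training data would return one 32-bit code per sample. Hash functions for unseen queries, as in the one-step and two-step variants described in the abstract, are not covered by this sketch.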
