Abstract

With new social media concepts emerging, zero-shot cross-modal retrieval methods have gained significant attention. Most existing methods assume that the labels of the training data are correct and that the different modalities are perfectly matched, which is unrealistic in real-life retrieval scenarios. This paper presents a novel approach, termed unpaired robust hashing with noisy labels (URHNL), for zero-shot cross-modal retrieval. Specifically, we develop a zero-shot cross-modal hash retrieval framework that learns distinct hash codes for each modality, making it suitable for unpaired cross-modal retrieval scenarios. In addition, the framework imposes a sparse constraint on the noise matrix and a low-rank constraint on the recovered label matrix; together, these constraints mitigate the negative effects of noisy labels. Furthermore, we introduce the concept of drag ɛ into the learning process of the label semantic embedding, which yields more discriminative hash codes. To improve the semantic consistency of the hash codes, we consider both intra-modal and inter-modal similarity. Extensive experiments on cross-modal datasets demonstrate the effectiveness of the URHNL approach in realistic and complex zero-shot cross-modal retrieval scenarios. The source code of this work can be found at https://github.com/szq0816/URHNL.
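
To illustrate how a sparse constraint on a noise matrix and a low-rank constraint on a recovered label matrix are typically combined, the following LaTeX sketch gives one plausible formulation; the symbols Y (observed noisy label matrix), L (recovered label matrix), E (noise matrix), and the weight \lambda are illustrative assumptions and are not taken from the paper itself.

% A minimal sketch of a sparse-plus-low-rank label decomposition,
% assuming the observed label matrix splits as Y = L + E.
% L, E, and \lambda are hypothetical symbols for exposition only.
\begin{equation}
\min_{L,\,E}\; \|L\|_{*} + \lambda \|E\|_{1}
\quad \text{s.t.} \quad Y = L + E,
\end{equation}
% where \|L\|_{*} (the nuclear norm) is the standard convex surrogate
% for the low-rank constraint on the recovered label matrix, and
% \|E\|_{1} promotes sparsity of the label-noise matrix E.

Under such a decomposition, gross but sparse label corruptions are absorbed into E, while L retains the correlated, low-rank structure expected of clean multi-label annotations.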
