Hashing has been extensively studied for large-scale cross-modal retrieval. However, several issues remain unexplored in existing methods. First, both the initially provided similarity graph and the one-hot label vector describe only logical yes-or-no information, which cannot precisely reflect the correlations between instances. Second, all bits of a binary code contribute equally to the Hamming distance, which limits its discriminability for preserving refined semantics. To address these limitations, this paper presents a new method, dubbed Weighted Cross-Modal Hashing (WCMH). It introduces the topological structure of categories to enhance the logical labels, i.e., to recover the real-valued distribution over categories. Moreover, we design a weighted Hamming distance that enlarges the range of distance values by adaptively learning the weights of different hash bits. In this way, more discriminative information is embedded into the hash codes, and the enhanced semantics are preserved as much as possible. Extensive experiments validate the superior performance of WCMH in comparison with state-of-the-art cross-modal hashing approaches.
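To illustrate the core idea of a weighted Hamming distance, the following is a minimal sketch, not the paper's actual formulation: each bit position carries a non-negative weight, and the distance sums the weights of the bits on which two codes disagree. The specific weight values below are hypothetical; in WCMH they would be learned adaptively during training.

```python
import numpy as np

def weighted_hamming(a, b, w):
    """Weighted Hamming distance: sum of per-bit weights over positions
    where the two binary codes (in {-1, +1}) disagree."""
    return float(np.sum(w * (a != b)))

# Two hypothetical 8-bit hash codes differing at positions 1 and 5.
a = np.array([1, -1, 1, 1, -1,  1, -1, 1])
b = np.array([1,  1, 1, 1, -1, -1, -1, 1])

# Uniform weights recover the plain Hamming distance, where every bit
# contributes equally (the limitation the paper points out).
uniform = np.ones(8)

# Hypothetical learned weights: bit 1 is treated as more discriminative,
# bit 5 as less, so equal bit-flip counts can yield different distances.
learned = np.array([0.5, 2.0, 1.0, 1.0, 1.0, 0.25, 1.0, 1.0])

print(weighted_hamming(a, b, uniform))  # 2.0  (plain Hamming distance)
print(weighted_hamming(a, b, learned))  # 2.25 (2.0 + 0.25)
```

Because the weights are real-valued, the weighted distance takes many more distinct values than the integer-valued plain Hamming distance, which is how the enlarged distance range described above arises.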