Cross-modal hashing (CMH) has gained much attention due to its effectiveness and efficiency in facilitating efficient retrieval between different modalities. Whereas, most existing methods unconsciously ignore the hierarchical structural information of the data, and often learn a single-layer hash function to directly transform cross-modal data into common low-dimensional hash codes in one step. This sudden drop of dimension and the huge semantic gap can cause the discriminative information loss. To this end, we adopt a coarse-to-fine progressive mechanism and propose a novel Hierarchical Consensus Cross-Modal Hashing (HCCH). Specifically, to mitigate the loss of important discriminative information, we propose a coarse-to-fine hierarchical hashing scheme that utilizes a two-layer hash function to refine the beneficial discriminative information gradually. And then, the <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$\ell _{2,1}$</tex-math></inline-formula> -norm is imposed on the layer-wise hash function to alleviate the effects of redundant and corrupted features. Finally, we present consensus learning to effectively encode data into a consensus space in such a progressive way, thereby reducing the semantic gap progressively. Through extensive contrast experiments with some advanced CMH methods, the effectiveness and efficiency of our HCCH method are demonstrated on four benchmark datasets. We have released the source code at <uri xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">https://github.com/sunyuan-cs</uri> .
Read full abstract