The amount of multi-modal data available on the Internet is enormous. Cross-modal hash retrieval maps heterogeneous cross-modal data into a single Hamming space to offer fast and flexible retrieval services. However, existing cross-modal methods mainly rely on feature-level similarity between multi-modal data and ignore both the relative rankings and the label-level fine-grained similarity of neighboring instances. To address these issues, we propose a novel Deep Cross-modal Hashing based on Semantic Consistent Ranking (DCH-SCR) that comprehensively investigates the intra-modal semantic similarity relationship. Firstly, to the best of our knowledge, this is an early attempt to preserve semantic similarity for cross-modal hashing retrieval by combining label-level and feature-level information. Secondly, the inherent gap between modalities is narrowed by developing a ranking alignment loss function. Thirdly, compact and efficient hash codes are optimized based on the common semantic space. Finally, we use the gradient to specify the optimization direction and introduce the Normalized Discounted Cumulative Gain (NDCG) to achieve varying optimization strengths for data pairs with different similarities.
Extensive experiments on three real-world image-text retrieval datasets demonstrate the superiority of DCH-SCR over several state-of-the-art cross-modal retrieval methods.
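As an illustration of the NDCG measure mentioned above, the sketch below shows the standard DCG/NDCG computation over a ranked list of relevance grades; it is a minimal reference implementation of the metric itself, not the authors' weighting scheme (the function names and the idea of using the per-position discount as a pair weight are our own illustrative assumptions).

```python
import math

def dcg(relevances):
    # Discounted Cumulative Gain: each relevance grade is discounted
    # by log2 of its (1-based) rank position + 1.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(relevances):
    # Normalize by the DCG of the ideal (descending) ordering,
    # so a perfectly ranked list scores 1.0.
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

Because the discount decays with rank, errors among highly similar (top-ranked) pairs reduce NDCG more than errors among dissimilar ones, which is what makes it a natural source of similarity-dependent optimization strengths.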