Cross-modal hashing (CMH) has attracted considerable attention in recent years. Almost all existing CMH methods primarily focus on reducing the modality gap and semantic gap, i.e., aligning multi-modal features and their semantics in Hamming space, without taking into account the space gap, i.e., difference between the real number space and the Hamming space. In fact, the space gap can affect the performance of CMH methods. In this paper, we analyze and demonstrate how the space gap affects the existing CMH methods, which therefore raises two problems: solution space compression and loss function oscillation. These two problems eventually cause the retrieval performance deteriorating. Based on these findings, we propose a novel algorithm, namely Semantic Channel Hashing (SCH). First, we classify sample pairs into fully semantic-similar, partially semantic-similar, and semantic-negative ones based on their similarity and impose different constraints on them, respectively, to ensure that the entire Hamming space is utilized. Then, we introduce a semantic channel to alleviate the issue of loss function oscillation. Experimental results on three public datasets demonstrate that SCH outperforms the state-of-the-art methods. Furthermore, experimental validations are provided to substantiate the conjectures regarding solution space compression and loss function oscillation, offering visual evidence of their impact on the CMH methods.
Read full abstract