Abstract
With the explosion of multi-modal Web data, effective and efficient techniques are urgently needed for cross-modal retrieval of semantically relevant data. Among the possible solutions, hashing techniques provide compact, efficiently comparable binary representations and have thus gained much attention in this research domain. To better handle diverse real-world data, we propose MSC, a novel cross-modal hashing approach based on generalized ℓp-norm Multiple Subgraph Combination. Specifically, by jointly considering content similarity, correspondence, and other weak correlations among cross-modal documents, we build the intra-modal similarity with multiple affinity subgraphs and encode the inter-modal correlation with a bipartite subgraph. These subgraphs are then combined into a single multi-modal similarity graph covering all data from the heterogeneous modalities, where the weights of the multiple intra-modal visual similarity subgraphs are regularized by an ℓp-norm penalty. The optimal hash codes and the combination coefficients are learned simultaneously by efficient alternating optimization. The hash functions for the different modalities are learned separately using nonlinear classification models, encoding the complicated semantic relations among cross-modal data. Experiments on challenging real-world datasets demonstrate the advantage of our method over existing approaches.
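To illustrate the ℓp-norm-regularized subgraph combination step described above, the sketch below shows a generic multiple-graph weighting scheme of the kind commonly used in such formulations. This is a hypothetical illustration, not the paper's exact update rule: the function names, the `quality` scores (e.g. a graph-fit cost such as tr(HᵀLmH) per subgraph), and the normalization `sum(mu**p) = 1` are all assumptions for the sketch.

```python
import numpy as np

def combine_subgraphs(subgraphs, quality, p=2.0):
    """Combine M affinity subgraphs into one similarity graph with
    lp-norm-regularized weights (illustrative sketch, not the paper's
    exact algorithm).

    subgraphs : list of (n, n) affinity matrices for one modality
    quality   : per-subgraph fit cost under the current hash codes
                (lower cost -> larger combination weight)
    p         : norm parameter of the lp penalty, p > 1
    """
    q = np.asarray(quality, dtype=float)
    # Closed-form weights of the usual multiple-graph/kernel form:
    # mu_m proportional to q_m^(1/(1-p)), normalized so sum(mu^p) = 1.
    w = q ** (1.0 / (1.0 - p))
    mu = w / (np.sum(w ** p) ** (1.0 / p))
    combined = sum(m * G for m, G in zip(mu, subgraphs))
    return combined, mu
```

In an alternating-optimization loop, one would re-estimate the hash codes on the combined graph and then recompute `quality` and the weights `mu` until convergence; note how a subgraph with lower cost receives a larger weight, with p controlling how sharply the weights concentrate on the best subgraphs.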