Abstract

Measuring the semantic similarity among cross-modal data objects, e.g., between images and texts, is recognized as the bottleneck of cross-modal retrieval. However, existing batch-style correlation learning methods suffer from prohibitive time complexity and extra memory consumption when handling large-scale, high-dimensional cross-modal data. In this paper, we propose a Cross-Modal Online Low-Rank Similarity function learning (CMOLRS) method, which learns a low-rank bilinear similarity measurement for cross-modal retrieval. We model the cross-modal relations by relative similarities on training data triplets and formulate the relative relations as a convex hinge loss. By adapting the margin in the hinge loss with pairwise distances in feature space and label space, CMOLRS effectively captures the multi-level semantic correlation and adapts to the content divergence among cross-modal data. Subject to a low-rank constraint, the similarity function is trained by online learning on the manifold of low-rank matrices. The low-rank constraint not only makes model learning faster and more scalable, but also improves the generalization ability of the model. We further propose fast-CMOLRS, which combines multiple triplets for each query at each model update step instead of the standard single triplet, further reducing the number of gradient updates and retractions. Extensive experiments are conducted on four public datasets, and comparisons with state-of-the-art methods show the effectiveness and efficiency of our approach.
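To make the ingredients named in the abstract concrete, the following is a minimal sketch of a low-rank bilinear similarity trained online with a triplet hinge loss. It is an illustration under stated assumptions, not the CMOLRS algorithm itself: the function names, the fixed `margin`, the learning rate `lr`, and the factorization W = A Bᵀ updated by a plain subgradient step are all assumptions made here. The abstract describes an adaptive margin and online learning on the manifold of low-rank matrices with retractions, neither of which is reproduced below.

```python
# Illustrative sketch only. Assumptions (not from the paper): the names below,
# a fixed margin, and a plain subgradient step on the factors A and B of
# W = A B^T in place of the manifold-based update with retraction.

def project(v, M):
    """Return M^T v for a vector v (length d) and a d x k matrix M."""
    k = len(M[0])
    return [sum(v[i] * M[i][r] for i in range(len(v))) for r in range(k)]

def bilinear_similarity(x, y, A, B):
    """S(x, y) = x^T (A B^T) y, computed as (A^T x) . (B^T y)."""
    return sum(p * q for p, q in zip(project(x, A), project(y, B)))

def triplet_hinge_loss(q, pos, neg, A, B, margin=1.0):
    """max(0, margin - S(q, pos) + S(q, neg)) for one training triplet."""
    return max(0.0, margin
               - bilinear_similarity(q, pos, A, B)
               + bilinear_similarity(q, neg, A, B))

def online_step(q, pos, neg, A, B, margin=1.0, lr=0.1):
    """One online update from a single triplet; no change if the loss is zero."""
    if triplet_hinge_loss(q, pos, neg, A, B, margin) == 0.0:
        return A, B
    k = len(A[0])
    d = [p - n for p, n in zip(pos, neg)]   # pos - neg in the target modality
    Aq = project(q, A)                      # A^T q
    Bd = project(d, B)                      # B^T (pos - neg)
    # Active loss is margin - (A^T q) . (B^T d), so its gradient is
    # -q (B^T d)^T with respect to A and -d (A^T q)^T with respect to B.
    A_new = [[A[i][r] + lr * q[i] * Bd[r] for r in range(k)] for i in range(len(q))]
    B_new = [[B[j][r] + lr * d[j] * Aq[r] for r in range(k)] for j in range(len(d))]
    return A_new, B_new

# Toy usage: a 2-d "image" query, 3-d "text" candidates, rank k = 1.
q, pos, neg = [1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
A = [[0.5], [0.5]]
B = [[0.5], [0.5], [0.5]]
loss_before = triplet_hinge_loss(q, pos, neg, A, B)
A1, B1 = online_step(q, pos, neg, A, B)
loss_after = triplet_hinge_loss(q, pos, neg, A1, B1)
```

Storing the two factors instead of the full matrix keeps memory at k(d₁ + d₂) numbers rather than d₁d₂, which is one way to see why a low-rank constraint helps scalability. In this toy run a single step lowers the loss on the triplet that triggered it.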

