Abstract
While deep hashing has made great progress in large-scale multimedia retrieval, most existing approaches under-explore semantic correlations and neglect the effect of context-aware visual learning. In this paper, we propose a dual-stream learning framework, termed Deep Collaborative Discrete Hashing (DCDH), which constructs a discriminative common discrete space by collaboratively incorporating the shared and individual semantics derived from visual features and semantic encodings. Specifically, DCDH generates context-aware representations by taking the outer product of visual embeddings and semantic encodings. To further preserve the original semantics and alleviate the class imbalance problem, we introduce the focal loss to exploit both frequent and rare concepts. Furthermore, a common binary code space is constructed through joint learning of the visual representations, the context-aware representations, and the label distribution calibration. Three losses, i.e., the pairwise similarity loss, the quantization loss, and the balanced classification loss, are collaboratively optimized in the general learning framework of DCDH. Extensive experiments on three large-scale benchmark datasets demonstrate the superiority of the proposed method, which yields state-of-the-art image retrieval performance.
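The abstract names the main ingredients (an outer product of visual and semantic vectors, a focal loss, a quantization loss, and a pairwise similarity loss) without giving formulas. As a rough illustration only, the sketch below uses forms that are common in the deep hashing literature; these are assumptions and may differ from the exact DCDH definitions. All function names, the `gamma` default, and the half inner product scaling are hypothetical choices made for this example.

```python
import math


def outer_product(visual, semantic):
    """Flattened outer product: every visual dim times every semantic dim."""
    return [v * s for v in visual for s in semantic]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def focal_loss(logits, labels, gamma=2.0, eps=1e-12):
    """Binary focal loss averaged over labels (no alpha weighting; assumed form)."""
    total = 0.0
    for z, y in zip(logits, labels):
        p = sigmoid(z)
        p_t = p if y == 1 else 1.0 - p      # probability of the true label
        p_t = min(max(p_t, eps), 1.0)       # guard against log(0)
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(logits)


def quantization_loss(codes):
    """Mean squared gap between continuous codes and their signs (assumed form)."""
    return sum((h - (1.0 if h >= 0 else -1.0)) ** 2 for h in codes) / len(codes)


def pairwise_similarity_loss(h_i, h_j, similar):
    """Negative log-likelihood of a pair label given theta = 0.5 * <h_i, h_j>."""
    theta = 0.5 * sum(a * b for a, b in zip(h_i, h_j))
    # Stable evaluation of log(1 + exp(theta)).
    softplus = max(theta, 0.0) + math.log1p(math.exp(-abs(theta)))
    return softplus - similar * theta
```

With `gamma=0` the focal loss reduces to ordinary binary cross-entropy; larger `gamma` down-weights well-classified labels, which is the usual motivation for applying it to imbalanced label sets.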