Abstract
Joint learning of image representations and hash codes is a dominant approach to approximate nearest neighbor search for large-scale image retrieval. Despite significant advances in deep learning to hash in the multi-label setting, optimizing semantic similarity-preserving representations with minimal quantization error remains challenging. Motivated by the recent success of contrastive representation learning in various vision tasks, this article introduces a Multi-label Contrastive Hashing (MCH) method for large-scale multi-label image retrieval. We extend the image similarity modeling in the existing supervised contrastive loss from a binary to a multi-level structure, so that multi-level semantic similarity between multi-label images can be well modeled and captured in learning to hash. Despite the appealing properties of contrastive learning, directly adapting it to multi-label hashing is non-trivial, as the quantization loss may restrict the optimization of the multi-level contrastive loss and thereby degrade multi-level similarity-preserving hashing. To this end, we design a curriculum strategy that adaptively adjusts the weight of the quantization loss by leveraging historical quantization deviations during training, such that multi-level semantic similarity is well preserved while the quantization deviation is progressively reduced. We conduct extensive experiments on three benchmark datasets: MirFlickr25k, NUS-WIDE, and IAPRTC-12. The results demonstrate the effectiveness of our approach, which outperforms several state-of-the-art solutions for hashing-based multi-label image retrieval.
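The abstract names two ingredients: a multi-level similarity signal derived from multi-label annotations, and a curriculum weight on the quantization loss driven by historical quantization deviations. The sketch below illustrates both under stated assumptions; the function and class names are illustrative, the Jaccard-overlap similarity and the exponential-moving-average schedule are our own guesses at plausible realizations, not the paper's actual implementation.

```python
def multilevel_similarity(labels_a, labels_b):
    """Multi-level semantic similarity between two multi-label images.

    Assumption: modeled here as Jaccard overlap of label sets, which
    yields graded values in [0, 1] instead of a binary same/different
    signal. The paper only states that similarity is multi-level.
    """
    a, b = set(labels_a), set(labels_b)
    return len(a & b) / len(a | b) if (a | b) else 0.0


def quantization_deviation(codes):
    """Mean distance of relaxed (continuous) hash codes from {-1, +1}."""
    return sum(abs(abs(c) - 1.0) for c in codes) / len(codes)


class CurriculumQuantizationWeight:
    """Hypothetical curriculum schedule for the quantization-loss weight.

    Tracks an exponential moving average of the per-batch quantization
    deviation and raises the weight as that historical deviation shrinks,
    so the multi-level contrastive loss dominates early training and
    quantization pressure grows only once codes are nearly binary.
    """

    def __init__(self, max_weight=1.0, momentum=0.9):
        self.max_weight = max_weight
        self.momentum = momentum
        self.ema_deviation = None  # historical deviation, updated per batch

    def step(self, codes):
        dev = quantization_deviation(codes)
        if self.ema_deviation is None:
            self.ema_deviation = dev
        else:
            self.ema_deviation = (self.momentum * self.ema_deviation
                                  + (1 - self.momentum) * dev)
        # Smaller historical deviation -> larger quantization weight.
        return self.max_weight * (1.0 - min(self.ema_deviation, 1.0))
```

As a usage example, two images tagged {sky, sea} and {sea, boat} would get similarity 1/3 rather than a hard 0/1 label, and a batch of codes already close to ±1 would push the quantization weight near its maximum.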