Abstract

Deep hashing methods have achieved tremendous success in cross-modal retrieval owing to their low storage cost and fast retrieval speed. Supervised cross-modal hashing methods have made substantial progress by incorporating semantic information. However, supervised methods rely heavily on large-scale labeled cross-modal training data, which are laborious to obtain. Moreover, most cross-modal hashing methods handle only the two modalities of image and text, without considering settings with more than two modalities. In this paper, we propose a novel semi-supervised approach, semi-supervised knowledge distillation for cross-modal hashing (SKDCH), to overcome these challenges: outputs produced by a semi-supervised method guide a supervised method for multimodal retrieval. Specifically, we utilize teacher-student optimization to propagate knowledge. Furthermore, we improve the triplet ranking loss to better mitigate the heterogeneity gap, which increases the discriminability of our approach. Extensive experiments on two benchmark datasets validate that the proposed SKDCH surpasses state-of-the-art methods.
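The abstract does not give the loss formulations, but a minimal sketch may help make the two ingredients concrete. The PyTorch snippet below illustrates (a) a generic margin-based cross-modal triplet ranking loss over relaxed hash codes and (b) a standard temperature-scaled knowledge distillation term (Hinton et al., 2015). The function names, margin, and temperature are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch only -- the paper's exact losses are not given in the
# abstract; names, margin, and temperature below are assumptions.
import torch
import torch.nn.functional as F

def cross_modal_triplet_loss(img_codes, txt_codes, margin=0.5):
    """Generic margin-based triplet ranking loss across modalities.

    img_codes, txt_codes: (N, K) relaxed (real-valued) hash codes, where
    row i of each tensor encodes the same item (a matched image-text pair).
    Every non-matching row in the batch serves as a negative.
    """
    # Cosine similarity between every image code and every text code: (N, N).
    sim = F.normalize(img_codes, dim=1) @ F.normalize(txt_codes, dim=1).T
    pos = sim.diag().unsqueeze(1)  # matched-pair similarities, (N, 1)
    # Hinge: each negative should be at least `margin` less similar than its match.
    hinge = F.relu(margin - pos + sim)
    off_diag = ~torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    return hinge[off_diag].mean()

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """Temperature-scaled KL term that lets a teacher's soft outputs guide a student."""
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)
```

In a semi-supervised setup of the kind the abstract describes, the distillation term would transfer knowledge from the semi-supervised teacher to the supervised student, while the triplet term pulls matched image-text codes together and pushes mismatched ones apart across the heterogeneity gap.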
