Self-supervised learning-based weight adaptive hashing for fast cross-modal retrieval

Yifan Li, Qing Liao, Jiajia Zhang, Xuan Wang, Zoe L Jiang, Jian Guan, Shuhan Qi, Chengkai Huang

doi:10.1007/s11760-019-01534-0

Abstract

Owing to its low storage cost and fast search speed, hashing is widely used in cross-modal retrieval. However, two crucial bottlenecks remain. First, suitably large datasets for multimodal data are lacking. Second, imbalanced instances degrade the accuracy of the retrieval system. In this paper, we propose an end-to-end, self-supervised learning-based weight adaptive hashing method for cross-modal retrieval. To address the dataset limitation, we use a self-supervised approach to extract fine-grained features directly from the labels and use these features to supervise hash learning for the other modalities. To overcome the problem of imbalanced instances, we design an adaptive weight loss that flexibly adjusts the weights of training samples according to their proportions. In addition, we use a binary approximation regularization term to reduce the approximation error of the binary codes. Experiments on the MIRFLICKR-25K and NUS-WIDE datasets show that our method improves performance by 3% compared with other methods.
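The adaptive weighting and binary approximation ideas mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss formulas, so everything below is an assumption made for illustration: the inverse-proportion weighting rule, the pairwise likelihood loss, the squared-error penalty, and all function names (`adaptive_pair_weights`, `weighted_pairwise_loss`, `binary_approximation_penalty`) are hypothetical and are not the authors' definitions.

```python
import numpy as np


def adaptive_pair_weights(sim):
    """Weight each pair inversely to the proportion of its class.

    ASSUMPTION: illustrative rule only; the paper's weighting may differ.
    sim[i, j] is 1 if items i and j share a label, else 0.
    """
    sim = np.asarray(sim, dtype=float)
    pos_frac = sim.mean()          # proportion of similar pairs
    neg_frac = 1.0 - pos_frac      # proportion of dissimilar pairs
    eps = 1e-12                    # guards against an empty class
    w_pos = 0.5 / max(pos_frac, eps)
    w_neg = 0.5 / max(neg_frac, eps)
    # The rarer class receives the larger weight.
    return np.where(sim > 0, w_pos, w_neg)


def weighted_pairwise_loss(img_codes, txt_codes, sim):
    """Weighted negative log-likelihood over image-text pairs.

    ASSUMPTION: a generic pairwise likelihood loss, used here only to
    show where per-pair weights would enter.
    """
    img_codes = np.asarray(img_codes, dtype=float)
    txt_codes = np.asarray(txt_codes, dtype=float)
    sim = np.asarray(sim, dtype=float)
    theta = 0.5 * img_codes @ txt_codes.T
    # log(1 + exp(theta)) computed stably via logaddexp.
    nll = np.logaddexp(0.0, theta) - sim * theta
    return float(np.mean(adaptive_pair_weights(sim) * nll))


def binary_approximation_penalty(codes):
    """Mean squared gap between real-valued codes and their signs.

    ASSUMPTION: a common squared-error form, not taken from the paper.
    """
    codes = np.asarray(codes, dtype=float)
    binary = np.where(codes >= 0, 1.0, -1.0)
    return float(np.mean((binary - codes) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = np.tanh(rng.normal(size=(4, 8)))   # relaxed 8-bit image codes
    txt = np.tanh(rng.normal(size=(4, 8)))   # relaxed 8-bit text codes
    sim = np.eye(4)                          # only matching pairs are similar
    print(weighted_pairwise_loss(img, txt, sim))
    print(binary_approximation_penalty(img))
```

With this rule, the weights average to 1 whenever both similar and dissimilar pairs are present, so the weighted loss stays on the same scale as the unweighted one while the rarer pair type contributes more per pair.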

Full Text