Abstract

Image retrieval is becoming increasingly important due to the rapid growth in the number of images on the web. To improve the efficiency of computing image similarity, hashing has moved into the focus of research. This paper proposes a Deep Attention-based Hash (DAH) retrieval model, which combines an attention module with a convolutional neural network to obtain hash codes with strong representational ability. DAH has the following properties: the Hamming distance between the hash codes of similar images is small, while the Hamming distance between the hash codes of dissimilar images exceeds a larger constant value, and the quantization loss incurred when mapping from Euclidean distance to Hamming distance is minimized. DAH achieves high image retrieval precision: we thoroughly compare it with ten state-of-the-art approaches on the CIFAR-10 dataset. The results show that the Mean Average Precision (MAP) of DAH exceeds 92% for 12-, 24-, 36- and 48-bit hash codes on CIFAR-10, which is better than what the state-of-the-art methods used for comparison deliver.
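As an illustration of the retrieval step described above, the sketch below ranks database images by the Hamming distance between their binary hash codes and the code of a query image. The function names and toy data are ours, not the paper's.

```python
import numpy as np

def hamming_distance(code_a: np.ndarray, code_b: np.ndarray) -> int:
    """Number of bits in which two binary hash codes differ."""
    return int(np.count_nonzero(code_a != code_b))

def rank_by_hamming(query_code: np.ndarray, db_codes: np.ndarray) -> np.ndarray:
    """Return database indices sorted by increasing Hamming distance to the query."""
    distances = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(distances, kind="stable")

# Toy example with 12-bit codes (the shortest code length evaluated in the paper).
rng = np.random.default_rng(0)
db_codes = rng.integers(0, 2, size=(1000, 12), dtype=np.int8)
query_code = rng.integers(0, 2, size=12, dtype=np.int8)
top_10 = rank_by_hamming(query_code, db_codes)[:10]
```

Because the codes are short binary vectors, this comparison is far cheaper than computing similarities between raw image features, which is the efficiency argument made in the abstract.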

Highlights

  • Guaranteeing both the efficiency and the accuracy of image retrieval is a very challenging problem

  • In order to assess the performance of the proposed Deep Attention-based Hash (DAH) method, we perform a large number of evaluations on the CIFAR-10 dataset

  • We find that the attention module improves the precision of DAH over NoneA-DAH by 0.43%, 0.13%, 0.32%, and 0.42% at 12, 24, 36, and 48 bits, respectively


Summary

INTRODUCTION

Guaranteeing both the efficiency and the accuracy of image retrieval is a very challenging problem. We present the Deep Attention-based Hash (DAH) coding method with pairwise label information for large-scale image retrieval. Deep Regularized Similarity Comparison Hashing (DRSCH) [23] uses the similarity between image pairs as a regularization term while relying on a triplet loss; such methods aim to provide a universal solution for CBIR tasks. The binary codes b1 and b2 should be distributed uniformly over all images in order to increase the information capacity of the hash function, which reduces the quantization loss when mapping from the Euclidean space to the binary space. With this in mind, we redefine the loss function in the original space accordingly (see the sketch below). We adopt the mini-batch gradient descent (MBGD) method to ensure smooth improvements when training the model.
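A minimal sketch of such a pairwise loss is given below, assuming a PyTorch implementation; the margin, weighting, and class name are illustrative placeholders and are not taken from the paper. The loss pulls the relaxed codes of similar pairs together, pushes dissimilar pairs at least a constant margin apart, and penalizes the quantization gap between the relaxed outputs and binary values.

```python
import torch
import torch.nn as nn

class PairwiseHashLoss(nn.Module):
    """Illustrative pairwise hashing loss with a quantization penalty."""

    def __init__(self, margin: float = 24.0, quant_weight: float = 0.1):
        super().__init__()
        self.margin = margin              # minimum squared distance for dissimilar pairs
        self.quant_weight = quant_weight  # weight of the quantization penalty

    def forward(self, u1: torch.Tensor, u2: torch.Tensor, similar: torch.Tensor) -> torch.Tensor:
        # u1, u2: real-valued network outputs in [-1, 1], shape (batch, bits)
        # similar: 1.0 for similar image pairs, 0.0 for dissimilar pairs
        dist = ((u1 - u2) ** 2).sum(dim=1)
        pull = similar * dist                                               # similar pairs: small distance
        push = (1.0 - similar) * torch.clamp(self.margin - dist, min=0.0)   # dissimilar: at least `margin` apart
        quant = ((u1.abs() - 1.0) ** 2).sum(dim=1) + ((u2.abs() - 1.0) ** 2).sum(dim=1)
        return (pull + push + self.quant_weight * quant).mean()

# Binary codes would be obtained by thresholding, e.g. b = torch.sign(u1);
# training would proceed with mini-batch gradient descent, e.g. torch.optim.SGD.
```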

TRAINING
EXPERIMENTS AND DISCUSSIONS
Findings
CONCLUSION
