Abstract

Image retrieval, or content-based image retrieval (CBIR), can be reduced to computing the distance between image feature vectors: the closer two vectors are, the more similar the images. In an image retrieval system for a large-scale dataset, approximate nearest-neighbor (ANN) search can quickly obtain the k images closest to the query image, which is the Top-k problem in the field of information retrieval. With traditional ANN algorithms such as KD-Tree, R-Tree, and M-Tree, the computing time grows exponentially as the dimension of the image feature vector increases, owing to the curse of dimensionality. To reduce the computing time and improve the efficiency of image retrieval, we propose an ANN search algorithm based on the Product Quantization Table (PQTable). After the image feature vectors are quantized and compressed by the product quantization algorithm, we construct the PQTable image index structure, which speeds up image retrieval. We also propose a multi-PQTable query strategy for ANN search. In addition, we generate several nearest-neighbor vectors for each sub-compressed vector of the query vector to reduce the failure rate and improve recall in image retrieval. Theoretical analysis and experimental verification show that the multi-PQTable query strategy and the generation of several nearest-neighbor vectors are correct and efficient.
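The indexing idea summarized above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the toy data, and the parameter `r` (number of nearest codewords kept per sub-vector) are hypothetical, and codewords are sampled from the data rather than learned with k-means. It covers a single table only; the multi-PQTable strategy is not reproduced here.

```python
import itertools
import random
from collections import defaultdict


def split(vector, num_subspaces):
    """Cut a vector into equally sized sub-vectors."""
    size = len(vector) // num_subspaces
    return [vector[m * size:(m + 1) * size] for m in range(num_subspaces)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebooks(data, num_subspaces, num_codewords, rng):
    """Build one codebook per subspace.

    Assumption: codewords are sampled from the data. Product quantization
    normally learns them with k-means, omitted here to keep the sketch short.
    """
    codebooks = []
    for m in range(num_subspaces):
        picks = rng.sample(range(len(data)), num_codewords)
        codebooks.append([split(data[i], num_subspaces)[m] for i in picks])
    return codebooks


def nearest_codewords(vector, codebooks, r=1):
    """For each sub-vector, return the indices of its r closest codewords."""
    result = []
    for sub, book in zip(split(vector, len(codebooks)), codebooks):
        order = sorted(range(len(book)), key=lambda k: sq_dist(sub, book[k]))
        result.append(order[:r])
    return result


def build_table(data, codebooks):
    """Hash table mapping a PQ code (tuple of indices) to vector ids."""
    table = defaultdict(list)
    for idx, vec in enumerate(data):
        code = tuple(ids[0] for ids in nearest_codewords(vec, codebooks))
        table[code].append(idx)
    return table


def query(table, codebooks, q, r=2):
    """Probe every code built from the r nearest codewords per sub-vector."""
    choices = nearest_codewords(q, codebooks, r)
    hits = []
    for code in itertools.product(*choices):
        hits.extend(table.get(code, []))
    return hits


rng = random.Random(0)
data = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(200)]
codebooks = train_codebooks(data, num_subspaces=2, num_codewords=4, rng=rng)
table = build_table(data, codebooks)
candidates = query(table, codebooks, data[0], r=2)
```

With `r=1` the query probes only its own bucket, which can be empty or miss true neighbors that fall just across a cell boundary. Raising `r` probes `r ** num_subspaces` buckets, trading lookup time for a lower failure rate. In a full system the candidates would then be re-ranked by distance to the query.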

Highlights

  • With the rapid development of mobile internet and social multimedia, images and videos are growing explosively every day

  • This section describes in detail the multi-product quantization table (multi-PQTable) for approximate nearest-neighbor (ANN) search

  • The search results of the multi-PQTable algorithm are evaluated by mean average precision, followed by a precision analysis of the experiment

Summary

Introduction

Information 2019, 10, 190

With the rapid development of mobile internet and social multimedia, images and videos are growing explosively every day. When traditional tree-based indexes deal with high-dimensional image feature vectors, their performance is not even as good as that of linear search [21]. The locality-sensitive hashing (LSH) algorithm approaches high-dimensional vector search from another angle. For a hashing algorithm to reach high retrieval accuracy, the hash code needs to be long enough. A long code, however, lowers the collision probability of similar samples during the random transformation and therefore reduces the recall rate. Building on the PQ algorithm and on the ability of a hash table to find the required content quickly, we propose a product quantization table (PQTable) algorithm. It implements a non-exhaustive approximate nearest-neighbor search that aims to retrieve vector candidate sets in a large-scale dataset quickly and accurately.
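The trade-off between hash code length and recall can be made concrete with a small Python sketch. It assumes one particular LSH family, sign random projections, where a single bit agrees for two vectors at angle θ with probability 1 − θ/π. The text does not say which family is meant, so the function name and the chosen angle are illustrative only.

```python
import math


def collision_probability(angle, code_length):
    """Probability that two vectors at `angle` radians agree on every bit.

    Assumption: each bit is the sign of an independent random projection,
    so one bit matches with probability 1 - angle / pi.
    """
    per_bit = 1.0 - angle / math.pi
    return per_bit ** code_length


# A fairly similar pair of vectors, 30 degrees apart.
similar = math.pi / 6
probabilities = [collision_probability(similar, n) for n in (8, 16, 32, 64)]
```

Under this assumption the chance that a similar pair lands in the same bucket shrinks geometrically with the code length, which is the recall loss described above.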

Product Quantization
Multi-PQTable for ANN Search
Problem Description
The image retrieval task
PQTable
Multi-PQTable Query Strategy
Experiments and Analysis
Experimental Settings
The construction
Experimental
Precision
Parameter Settings
Image Retrieval Examples
Conclusions