Abstract
In this empirical study, we evaluate the impact of the dimensions’ value cardinality (DVC) of image descriptors on the performance of large-scale similarity search. DVCs are inherent characteristics of image descriptors, defined for each dimension as the number of distinct values that the descriptors take in that dimension, and thus express the dimension’s discriminative power. In our experiments with six publicly available datasets of image descriptors of different dimensionality (64–5,000 dimensions) and size (240 K–1 M), (a) we show that DVC varies because the various extraction methods use different quantization and normalization techniques; (b) we also show that image descriptor extraction strategies tend to follow the same DVC distribution function family, so similarity search strategies can exploit image descriptors’ DVCs irrespective of dataset size; (c) based on a canonical correlation analysis, we demonstrate that image descriptors’ DVCs have a significant impact on the performance of the baseline LSH method [8] and three state-of-the-art hashing methods, SKLSH [28], PCA-ITQ [10], and SPH [12], as well as on the performance of the MSIDX method [34], which exploits the DVC information; (d) we experimentally demonstrate the influence of DVCs on both sequential search and the aforementioned similarity search methods, and discuss the advantages of our findings. We hope that our work will motivate researchers to consider DVC analysis as a tool for the design of similarity search strategies in image databases.
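To make the DVC definition concrete, the following is a minimal sketch that counts the distinct values per dimension for a small set of descriptors. The function name `dimension_value_cardinalities` and the toy data are hypothetical and serve only as an illustration. Ranking dimensions by DVC at the end is one possible way to use the counts and is not claimed to reproduce MSIDX or any of the hashing methods named above.

```python
from typing import List, Sequence


def dimension_value_cardinalities(
    descriptors: Sequence[Sequence[float]],
) -> List[int]:
    """Return, for each dimension, the number of distinct values observed.

    `descriptors` is a list of equal-length vectors (one per image).
    The name and signature are illustrative assumptions, not from the paper.
    """
    if not descriptors:
        return []
    n_dims = len(descriptors[0])
    if any(len(row) != n_dims for row in descriptors):
        raise ValueError("all descriptors must have the same dimensionality")
    # A set keeps each distinct value once, so its size is the DVC.
    return [len({row[j] for row in descriptors}) for j in range(n_dims)]


# Toy data (made up): 4 descriptors with 3 dimensions each.
toy_descriptors = [
    [0.0, 1.0, 0.25],
    [0.0, 2.0, 0.50],
    [0.0, 1.0, 0.75],
    [0.0, 3.0, 1.00],
]

dvc = dimension_value_cardinalities(toy_descriptors)
print(dvc)  # [1, 3, 4]

# Illustrative use: rank dimensions from highest to lowest DVC.
dims_by_dvc = sorted(range(len(dvc)), key=lambda j: dvc[j], reverse=True)
print(dims_by_dvc)  # [2, 1, 0]
```

In this toy example the first dimension is constant (DVC of 1) and therefore cannot separate any two descriptors, while the third dimension takes a different value for every descriptor.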
More From: International Journal of Multimedia Information Retrieval