Abstract

Feature coding and pooling, key components of image retrieval, have been widely studied over the past several years. Recently, sparse coding with max pooling has come to be regarded as the state of the art for image classification. However, there has been no comprehensive study of sparse coding applied to image retrieval. In this paper, we first analyze the effects of different sampling strategies on image retrieval. We then discuss how feature pooling strategies affect retrieval performance, giving a probabilistic explanation within the sparse coding framework, and propose a modified sum pooling procedure that improves retrieval accuracy significantly. Further, we apply sparse coding to aggregate multiple types of features for large-scale image retrieval. Extensive experiments on commonly used evaluation datasets demonstrate that our final compact image representation improves retrieval accuracy significantly.
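To make the pooling terminology concrete, the following is a minimal sketch of plain max pooling and plain sum pooling over a set of sparse codes. It is not the modified sum pooling procedure proposed in the paper, whose details are not given here. The function names (`max_pool`, `sum_pool`, `l2_normalize`) and the toy code matrix are assumptions made for illustration only.

```python
import numpy as np

def max_pool(codes):
    """Per dictionary atom, keep the largest absolute activation over all descriptors."""
    return np.abs(codes).max(axis=0)

def sum_pool(codes):
    """Per dictionary atom, add up the absolute activations over all descriptors."""
    return np.abs(codes).sum(axis=0)

def l2_normalize(v, eps=1e-12):
    """Scale a pooled vector to unit L2 norm so images can be compared by dot product."""
    return v / (np.linalg.norm(v) + eps)

# Hypothetical sparse codes: 3 local descriptors coded over a 4-atom dictionary.
codes = np.array([
    [0.0, 0.5, 0.0, 0.0],
    [0.2, 0.0, 0.0, 0.0],
    [0.0, 0.3, 0.0, 0.1],
])

max_vec = max_pool(codes)   # one value per atom: the strongest response
sum_vec = sum_pool(codes)   # one value per atom: the accumulated response
image_vec = l2_normalize(sum_vec)
```

In this toy case the two poolings differ only on the second atom, which is active in two descriptors: max pooling keeps the stronger response, while sum pooling accumulates both.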

Highlights

  • Most state-of-the-art image retrieval approaches rely on the bag-of-words (BoW) framework and its variants [1,2,3], which are built on local descriptors

  • The BoW model quantizes an image's local descriptors into visual words. A TF-IDF inverted index, a structure that originated in web text search, is then used to find the closest images in the database, followed by a re-ranking of the result list based on geometric considerations. This approach suffers from visual word ambiguity, feature quantization error, and memory constraints. Another promising approach aggregates the local descriptors of an image into a compact vector using the Fisher vector (FV) [4] or the Vector of Locally Aggregated Descriptors (VLAD) [5,6]

  • We use the Zurich and University of Kentucky Benchmark (UKB) datasets to evaluate the effects of sampling strategies


Summary

Introduction

Most state-of-the-art image retrieval approaches rely on the bag-of-words (BoW) framework and its variants [1,2,3], which are built on local descriptors. The BoW model quantizes an image's local descriptors into visual words. A TF-IDF inverted index, a structure that originated in web text search, is then used to find the closest images in the database, followed by a re-ranking of the result list based on geometric considerations. This approach suffers from visual word ambiguity, feature quantization error, and memory constraints. Our third contribution is to use sparse coding to aggregate multiple types of features for large-scale image retrieval.
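The BoW retrieval pipeline described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the system evaluated in the paper: images are assumed to be already quantized into lists of visual word ids, the weighting is a basic TF-IDF with cosine scoring, and the geometric re-ranking step is omitted. The names `build_index` and `query` and the toy database `db` are hypothetical.

```python
import math
from collections import Counter, defaultdict

def tfidf_vector(words, idf):
    """L2-normalized TF-IDF weights for one non-empty list of visual word ids."""
    tf = Counter(words)
    vec = {w: (c / len(words)) * idf.get(w, 0.0) for w, c in tf.items()}
    norm = math.sqrt(sum(x * x for x in vec.values())) or 1.0
    return {w: x / norm for w, x in vec.items()}

def build_index(db):
    """Build an inverted index: visual word id -> list of (image id, weight)."""
    df = Counter()
    for words in db.values():
        df.update(set(words))          # document frequency of each visual word
    idf = {w: math.log(len(db) / c) for w, c in df.items()}
    index = defaultdict(list)
    for image_id, words in db.items():
        for w, x in tfidf_vector(words, idf).items():
            index[w].append((image_id, x))
    return index, idf

def query(index, idf, words):
    """Rank database images by cosine similarity, visiting only shared visual words."""
    scores = defaultdict(float)
    for w, x in tfidf_vector(words, idf).items():
        for image_id, y in index.get(w, []):
            scores[image_id] += x * y
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Hypothetical database: image id -> visual word ids of its local descriptors.
db = {"a": [1, 1, 2], "b": [2, 3], "c": [3, 3, 4]}
index, idf = build_index(db)
ranked = query(index, idf, [1, 2])
self_match = dict(query(index, idf, db["a"]))["a"]
```

Because the query only touches the posting lists of its own visual words, images that share no word with the query (here image "c") are never scored, which is what makes the inverted index efficient on large databases.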

Related Work
A New Modified Sum Pooling Method
Experiments
Evaluation Datasets
Different Sampling Results with Experiment Verification
Conclusions
