Abstract

Constructing discriminative feature descriptors is crucial to effective image retrieval. The state-of-the-art global descriptor for this purpose is the Vector of Locally Aggregated Descriptors (VLAD). Given a set of local features (say, SIFT) extracted from an image, VLAD is generated by quantizing the local features with a small visual vocabulary (64 to 512 centroids), aggregating the residual statistics of the quantized features for each centroid, and concatenating the aggregated residual vectors across centroids. Search accuracy can be increased by enlarging the vocabulary (from hundreds to hundreds of thousands of centroids), which, however, incurs heavy computation cost with flat quantization. In this paper, we propose a hierarchical multi-VLAD approach that seeks a tradeoff between descriptor discriminability and computation complexity. We build a tree-structured hierarchical quantization (TSHQ) to accelerate VLAD computation with a large vocabulary. Since quantization error may propagate from the root to the leaf nodes (centroids) in TSHQ, we introduce multi-VLAD, which constructs a VLAD descriptor for each level of the vocabulary tree so as to compensate for the quantization error at that level. Extensive evaluation on benchmark datasets shows that the proposed approach outperforms the state of the art in terms of retrieval accuracy, extraction speed, and memory cost.
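To make the construction concrete, below is a minimal Python sketch of standard VLAD aggregation with flat quantization, together with a tree-descent quantizer in the spirit of TSHQ. The data layout (centroid arrays, node dicts) and all names are illustrative assumptions for exposition, not the paper's implementation.

```python
import numpy as np

def vlad(features, centroids):
    """VLAD with flat quantization: assign each local feature (e.g. 128-D SIFT)
    to its nearest centroid, accumulate the residual (feature - centroid) per
    centroid, then concatenate and L2-normalize. Output dimension is k * d."""
    k, d = centroids.shape
    agg = np.zeros((k, d))
    for f in features:
        c = int(np.argmin(np.linalg.norm(centroids - f, axis=1)))  # O(k) search
        agg[c] += f - centroids[c]          # residual statistics per centroid
    v = agg.reshape(-1)                     # concatenate per-centroid residuals
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def tree_quantize(feature, node):
    """Tree-structured quantization in the spirit of TSHQ: descend a vocabulary
    tree, choosing the nearest child centroid at each level. With branching
    factor b and depth L this costs O(b * L) per feature instead of O(b ** L)
    for a flat search over the same number of leaf centroids. Each `node` is
    assumed to be a dict {'centroids': (b, d) array, 'children': [node|None]}.
    Returns the root-to-leaf path of child indices; a multi-VLAD scheme would
    aggregate residuals at every level of this path, not only at the leaf."""
    path = []
    while node is not None:
        idx = int(np.argmin(np.linalg.norm(node['centroids'] - feature, axis=1)))
        path.append(idx)
        node = node['children'][idx]        # None once a leaf level is reached
    return path
```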
