Bag of Features vs Vector of Locally Aggregated Descriptors

Farkhunda Younas,Junaid Baber,Maheen Bakhtyar,Javeria Farooq,Tahir Mahmood

doi:10.1007/978-3-319-56991-8_10

Abstract

Image representation by set of local features are common and also state-of-the art for many applications such as image retrieval and image classification. A single image contains on average 2.5 k–3.0 k features. Searching the images based on local features are discriminative compared to global features at the cost of heavy computational overhead. Bag-of-Features (BoF), also known as bag-of-visual words, are used for feature quantization which makes searching local features feasible in very large databases at the cost of distinctiveness. Mostly, the vocabulary size in those applications is kept up-to 1 million. In this research study, we investigated the performance of Vector of Locally Aggregated Descriptors (VLAD) which is recently proposed as an alternative to BoF for different families of descriptor. The VLAD achieves similar or sometimes better performance when compared to BoF despite of limited vocabulary size. The performance of VLAD is mostly compared with BoF on gradient based descriptors in literature. In our experiments, we take gradient based descriptor, intensity based descriptor, and binary descriptor. Scale Invariant Feature Transform (SIFT), Local Intensity Order Pattern (LIOP) and BInarization of Gradient Orientation Histograms (BIGOH) are used to validate the performance of VLAD in parallel to BoF on famous benchmark dataset. VLAD outperforms BoF in gradient based family and intensity based family but non of these are feasible for binary descriptors.

Full Text