BIG-OH: BInarization of gradient orientation histograms

Junaid Baber, Matthew N Dailey, Shin'Ichi Satoh, Nitin Afzulpurkar, Maheen Bakhtyar

doi:10.1016/j.imavis.2014.08.006

Abstract

Extracting local keypoints and keypoint descriptions from images is a primary step for many computer vision and image retrieval applications. In the literature, many researchers have proposed methods for representing local texture around keypoints with varying levels of robustness to photometric and geometric transformations. Gradient-based descriptors such as the Scale Invariant Feature Transform (SIFT) are among the most consistent and robust descriptors. The SIFT descriptor, a 128-element vector consisting of multiple gradient histograms computed from local image patches around a keypoint, is widely considered the gold standard keypoint descriptor. However, SIFT descriptors require at least 128 bytes of storage per descriptor. Since images are typically described by thousands of keypoints, storing the SIFT descriptors for an image may require more space than the original image itself. This may be prohibitive in extremely large-scale applications and in applications on memory-constrained devices such as tablets and smartphones. In this paper, with the goal of reducing the memory requirements of keypoint descriptors such as SIFT without affecting their performance, we propose BIG-OH, a simple yet extremely effective method for binary quantization of any descriptor based on gradient orientation histograms. BIG-OH's memory requirements are very small: when it uses SIFT's default parameters for the construction of the gradient orientation histograms, it requires only 16 bytes per descriptor. BIG-OH quantizes gradient orientation histograms by computing a bit vector representing the relative magnitudes of local gradients associated with neighboring orientation bins.
In a series of experiments on keypoint matching with different types of keypoint detectors under various photometric and geometric transformations, we find that the quantized descriptor has performance comparable to or better than other descriptors, including BRISK, CARD, BRIEF, D-BRIEF, SQ, and PCA-SIFT. Our experiments also show that BIG-OH is extremely effective for image retrieval, with modestly better performance than SIFT. BIG-OH's drastic reduction in memory requirements, obtained while preserving or improving the image matching and image retrieval performance of SIFT, makes it an excellent descriptor for large image databases and applications running on memory-constrained devices.
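
The quantization idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact comparison rule, so the code below assumes one plausible reading: within each 8-bin orientation histogram, a bit is set when a bin's magnitude exceeds that of its next neighbor, with the last bin compared against the first. Under that assumption a 128-element SIFT-style descriptor (16 histograms of 8 bins) yields 128 bits, which matches the 16 bytes per descriptor stated above. The function names `binarize_histograms` and `hamming_distance` are hypothetical, and this is not presented as the authors' implementation.

```python
import numpy as np


def binarize_histograms(descriptor, n_bins=8):
    """Quantize a gradient-orientation-histogram descriptor to packed bits.

    Assumption (not stated in the abstract): bit i of each histogram is 1
    when bin i is strictly larger than bin (i + 1) mod n_bins.
    """
    d = np.asarray(descriptor, dtype=np.float32)
    if d.ndim != 1 or d.size % n_bins != 0:
        raise ValueError("descriptor must be 1-D with length divisible by n_bins")

    # One row per orientation histogram, e.g. 16 rows of 8 bins for SIFT.
    hists = d.reshape(-1, n_bins)

    # Column i of `neighbors` holds bin (i + 1) mod n_bins of the same histogram.
    neighbors = np.roll(hists, -1, axis=1)

    bits = (hists > neighbors).astype(np.uint8)

    # Pack 8 bits per byte: 128 bits become 16 bytes.
    return np.packbits(bits.ravel())


def hamming_distance(code_a, code_b):
    """Count differing bits between two packed binary descriptors."""
    return int(np.unpackbits(np.bitwise_xor(code_a, code_b)).sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.random(128)                     # stand-in for a SIFT descriptor
    b = a + 0.01 * rng.standard_normal(128)  # slightly perturbed copy
    code_a, code_b = binarize_histograms(a), binarize_histograms(b)
    print(code_a.nbytes, "bytes per descriptor")
    print("Hamming distance:", hamming_distance(code_a, code_b))
```

Because the resulting codes are binary, two descriptors can be compared with a Hamming distance rather than a Euclidean distance over 128 values. Hamming distance is used here as the natural choice for bit vectors; the abstract itself does not name the matching metric.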
