Abstract

Selecting local features is crucial for generating robust compact descriptors for mobile visual search. The state-of-the-art MPEG Compact Descriptors for Visual Search (CDVS) standard uses intrinsic characteristics of interest points (e.g., scale, orientation, peak, and center distance) to select salient local features for selective aggregation and compression of local feature descriptors at different bit rates. In particular, center-distance statistics are treated as an important attribute for feature selection in mobile visual search, which relies heavily on the assumption that the object is centered in a 2-dimensional query image. However, this ad hoc assumption is likely to fail to delineate query objects in a cluttered scene. In this paper, we propose incorporating a depth cue into local feature selection. As most mobile phones are not yet equipped with a depth sensor, we recover the disparity of local features from an auxiliary image to quickly estimate the depth of the query image. Experiments show that incorporating the depth cue into feature selection can significantly improve the retrieval performance of the state-of-the-art CDVS compact descriptors at lower bit rates. For example, the mAP improves from 84.5% to 88.6% at 512 bytes.
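The idea of using disparity as a depth cue for feature selection can be illustrated with a minimal sketch. It is not the paper's actual scoring function. It assumes that feature matches between the query image and the auxiliary image are already available, that the two views differ mainly by a horizontal shift, and that a larger disparity indicates a nearer (and therefore more likely foreground) feature. The names `disparities` and `select_by_disparity` are hypothetical and used only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) pixel coordinates of a keypoint


def disparities(query_pts: Sequence[Point], aux_pts: Sequence[Point]) -> List[float]:
    """Horizontal disparity, in pixels, for each matched keypoint pair.

    Assumption: query_pts[i] and aux_pts[i] are the same scene point seen
    in the query image and in the auxiliary image.
    """
    if len(query_pts) != len(aux_pts):
        raise ValueError("query_pts and aux_pts must have the same length")
    return [abs(q[0] - a[0]) for q, a in zip(query_pts, aux_pts)]


def select_by_disparity(
    query_pts: Sequence[Point], aux_pts: Sequence[Point], k: int
) -> List[int]:
    """Indices of the k features with the largest disparity.

    Assumption: larger disparity means smaller depth, so these features
    are treated as the most likely to belong to the query object.
    """
    d = disparities(query_pts, aux_pts)
    order = sorted(range(len(d)), key=lambda i: d[i], reverse=True)
    return order[:k]


# Toy example: four matched keypoints, keep the two nearest ones.
query = [(100.0, 50.0), (200.0, 80.0), (300.0, 120.0), (400.0, 60.0)]
aux = [(98.0, 50.0), (180.0, 80.0), (290.0, 120.0), (399.0, 60.0)]
selected = select_by_disparity(query, aux, k=2)
```

In a real pipeline, such a disparity-based score would more plausibly be combined with the other attributes mentioned above (scale, orientation, peak) rather than used alone; how the paper combines them is not stated in this abstract.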
