Abstract

Current feature-based object recognition methods use information derived from local image patches. For robustness, features are engineered for invariance to various transformations, such as rotation, scaling, or affine warping. When patches overlap object boundaries, however, errors in both detection and matching will almost certainly occur due to inclusion of unwanted background pixels. This is common in real images, which often contain significant background clutter, objects which are not heavily textured, or objects which occupy a relatively small portion of the image. We suggest improvements to the popular scale invariant feature transform (SIFT) which incorporate local object boundary information. The resulting feature detection and descriptor creation processes are invariant to changes in background. We call this method the background and scale invariant feature transform (BSIFT). We demonstrate BSIFT's superior performance in feature detection and matching on synthetic and natural images.

Highlights

  • Feature-based methods are commonly used for object recognition

  • Gradient orientations within the patch around each interest point are accumulated into a histogram, weighted by their magnitudes and by their distance from the patch center, in order to determine a dominant orientation for the interest point. (In practice, multiple peaks in this orientation histogram may result in multiple interest points at the same location, but with differing orientations; see the first sketch after this list.) Without any modification, this step violates background invariance, since it operates on a patch of image values that will often overlap both object and background pixels

  • To add background invariance to the standard Scale Invariant Feature Transform (SIFT) descriptor, a boundary-respecting weighting mask produced by fast marching is again used to weight each sample’s histogram contribution according to its distance from the interest point (see the second sketch below)
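
Below is a minimal sketch of the dominant-orientation step described in the highlights, following the standard SIFT recipe. The 36-bin histogram, the Gaussian falloff scale, and the 80%-of-peak acceptance threshold are conventional SIFT parameter choices assumed here for illustration; they are not taken from the paper.

```python
import numpy as np

def dominant_orientations(patch, num_bins=36, peak_ratio=0.8):
    """Candidate orientations (radians) for an interest point centered
    in `patch`, a square grayscale array."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2.0 * np.pi)

    # Weight each gradient sample by its magnitude and by a Gaussian
    # falloff with distance from the patch center.
    h, w = patch.shape
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    sigma = min(h, w) / 6.0  # assumed falloff scale
    weights = mag * np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2)
                           / (2.0 * sigma ** 2))

    hist, edges = np.histogram(ang, bins=num_bins,
                               range=(0.0, 2.0 * np.pi), weights=weights)

    # Every bin within peak_ratio of the strongest bin spawns its own
    # oriented interest point (multiple peaks -> multiple orientations).
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers[hist >= peak_ratio * hist.max()]
```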
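
The next sketch shows one way such a boundary-respecting weight mask could be built, assuming a known binary object mask and computing geodesic distance from the interest point by fast marching with scikit-fmm's skfmm.distance. The Gaussian falloff and the source of the mask are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np
import skfmm  # scikit-fmm: pip install scikit-fmm

def boundary_weight_mask(object_mask, center, sigma):
    """Weights in [0, 1] that fall off with geodesic (within-object)
    distance from `center`, and are exactly zero on background pixels.

    object_mask : bool array, True on object pixels
    center      : (row, col) of the interest point, on the object
    """
    phi = np.ones(object_mask.shape)
    phi[center] = -1.0  # the zero level set encloses the interest point
    # Masked (background) cells are excluded from the march, so distances
    # are measured within the object and never cross its boundary.
    dist = skfmm.distance(np.ma.MaskedArray(phi, ~object_mask))
    dist = np.asarray(dist.filled(np.inf))
    weight = np.exp(-dist ** 2 / (2.0 * sigma ** 2))
    weight[~object_mask] = 0.0
    return weight
```

In the descriptor stage, each gradient sample's histogram contribution would then be multiplied by this weight, so background samples contribute nothing.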

Summary

Introduction

Feature-based methods are commonly used for object recognition. Such approaches seek to efficiently match objects from a database to those seen in novel images using a sparse set of information-rich features extracted from images. Because all of these methods rely on local image information at various scales, features whose descriptors overlap both the object and the background will incorporate information from each. We would like to incorporate object-background (often called figure-ground) separation into the detection and description processes in order to achieve background invariance. Knowledge of this separation could be obtained from various sources, including stereo disparity [3], motion cues [4], local or global segmentation schemes [26, 23], simple background subtraction, or a combination of these methods [19]. Edge-based features that exhibit some degree of background invariance have emerged in order to recognize wiry shapes in cluttered scenes [6, 17, 10]. Such methods build local features from edge maps in order to capture shape rather than texture.

Background
Synthetic Results
Results from Real Data
Conclusion
