Abstract

Ranking hypothesis sets is a powerful concept for efficient object detection. In this work, we propose a branch&rank scheme that detects objects with often fewer than 100 ranking operations. This efficiency enables the use of strong but costly classifiers such as non-linear SVMs with RBF-χ² kernels. We thereby relieve an inherent limitation of branch&bound methods, whose bounds are often not tight enough to be effective in practice. Our approach features three key components: an adaptive subdivision of the object search space, a ranking function that operates on sets of hypotheses, and a grouping of these sets into different tasks. Detection efficiency results from adaptively subdividing the object search space into successively smaller sets. This is inherited from branch&bound, while the ranking function supersedes a tight bound, which is often unavailable (except for rather limited function classes). The grouping makes the system effective: it separates image classification from object recognition, yet combines them in a single formulation, phrased as a structured SVM problem. A novel aspect of branch&rank is that a better ranking function is expected to decrease the number of classifier calls during detection. We use the VOC'07 dataset to demonstrate the algorithmic properties of branch&rank.

Highlights

  • Object class detection in images is challenging because of two problems

  • Two sets with different labels can yield the same bounding box union and thus the same appearance descriptor. We address this problem with a multi-task framework that connects image classification with object detection

  • Branch&rank (Lehmann et al 2011a) generalises the idea of branch&bound (Lampert et al 2009; Lehmann et al 2011b): ranking improves efficiency and thereby enables the use of arbitrary classifiers, including non-linear SVMs with RBF-χ² kernels. This is a crucial advance in efficient object detection since strong classifiers are beneficial to properly model the object intra-class variations
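
The search strategy outlined in these highlights can be illustrated with a minimal best-first sketch. It is not the authors' implementation: the set representation (one integer interval per box parameter), the halving rule in `split`, and the toy scoring function `make_toy_rank` are all assumptions made for illustration. In the paper the ranking function is a learned classifier; here a simple distance to a known target box stands in for it.

```python
import heapq

def set_size(hyp_set):
    """Number of hypotheses in a set given as half-open intervals (lo, hi)."""
    n = 1
    for lo, hi in hyp_set:
        n *= hi - lo
    return n

def split(hyp_set):
    """Halve the set along its widest interval (assumed branching rule)."""
    i = max(range(len(hyp_set)), key=lambda k: hyp_set[k][1] - hyp_set[k][0])
    lo, hi = hyp_set[i]
    mid = (lo + hi) // 2
    left = hyp_set[:i] + ((lo, mid),) + hyp_set[i + 1:]
    right = hyp_set[:i] + ((mid, hi),) + hyp_set[i + 1:]
    return left, right

def branch_and_rank(root, rank):
    """Best-first search: always refine the currently highest-ranked set.

    Returns (hypothesis, score, number_of_rank_calls).
    """
    if set_size(root) < 1:
        raise ValueError("search space must contain at least one hypothesis")
    calls = 1
    counter = 0  # tie-breaker so the heap never compares sets directly
    heap = [(-rank(root), counter, root)]
    while heap:
        neg_score, _, hyp_set = heapq.heappop(heap)
        if set_size(hyp_set) == 1:
            # A single hypothesis remains: report it as the detection.
            return tuple(lo for lo, _ in hyp_set), -neg_score, calls
        for child in split(hyp_set):
            counter += 1
            calls += 1
            heapq.heappush(heap, (-rank(child), counter, child))
    raise RuntimeError("search space exhausted")

def make_toy_rank(target):
    """Hypothetical stand-in for a learned ranking function.

    Scores a set by the negative distance between the target box and the
    closest hypothesis in the set (0 if the set contains the target).
    """
    def rank(hyp_set):
        dist = 0
        for t, (lo, hi) in zip(target, hyp_set):
            if t < lo:
                dist += lo - t
            elif t >= hi:
                dist += t - (hi - 1)
        return -dist
    return rank

if __name__ == "__main__":
    root = ((0, 64),) * 4  # (left, top, right, bottom), each in [0, 64)
    box, score, calls = branch_and_rank(root, make_toy_rank((12, 7, 40, 30)))
    print(box, score, calls)
```

With this idealised toy score, the search reaches the target after 49 calls to `rank`, compared with the 64^4 = 16,777,216 hypotheses in the root set. A learned ranking function will generally be less well behaved, so this figure only illustrates the mechanism and says nothing about the efficiency reported in the paper.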

Summary

Introduction

Object class detection in images is challenging because of two problems. First, object appearances exhibit large variations due to intra-class variability, illumination changes, etc. Much progress has been made lately, manifesting itself in increasing evaluation scores on the VOC benchmark (Everingham et al 2007) over the last years. These advances suggest that strong classifiers and combinations of different image descriptors are required; nonlinear SVMs are consistently found to be well suited for this task (Vedaldi et al 2009; Gehler and Nowozin 2009). Such classifiers are expensive to evaluate, which makes it challenging to master the large search space: a sufficiently fine-grained sliding window search typically requires more than 10k classifier evaluations. We discuss both options separately, but note that they can be combined (Lampert 2010; Weiss et al 2010).
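
To make the size of the sliding window search space concrete, the following rough count may help. The image size (500 x 375 pixels), the three window sizes, and the stride of 4 pixels are assumptions chosen for illustration; none of them is taken from the paper.

```python
def count_sliding_windows(img_w, img_h, window_sizes, stride):
    """Count window placements on a regular grid for each (width, height)."""
    total = 0
    for w, h in window_sizes:
        if w > img_w or h > img_h:
            continue  # this window size does not fit into the image
        nx = (img_w - w) // stride + 1  # horizontal positions
        ny = (img_h - h) // stride + 1  # vertical positions
        total += nx * ny
    return total

# Assumed setup: three square window sizes, 4-pixel stride.
n = count_sliding_windows(500, 375, [(64, 64), (128, 128), (256, 256)], stride=4)
print(n)  # 16268
```

Even this coarse grid, with only three scales and a single aspect ratio, already exceeds 10k windows, each of which would need one classifier evaluation.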
