Abstract

The success of deformable part-based models (DPMs) for visual object detection relies on a large number of labeled bounding boxes. With only image-level annotations, our goal is to propose a model enhancing the weakly supervised DPMs by emphasizing the importance of location and size of the initial class-specific root filter. To adaptively select a discriminative set of candidate bounding boxes as this root filter estimate, first, we explore the generic objectness measurement to combine the most salient regions and “good” region proposals. Second, we propose learning of the latent class label of each candidate window as a binary classification problem, by training category-specific classifiers used to coarsely classify a candidate window into either a target object or a nontarget class. Finally, we design a flexible enlarging-and-shrinking postprocessing procedure to modify the DPMs outputs, which can effectively match the approximative object aspect ratios and further improve final accuracy. Extensive experimental results on the challenging PASCAL Visual Object Class 2007 and the Microsoft Common Objects in Context 2014 dataset demonstrate that our proposed framework is effective for initialization of the DPM's root filter. It also shows competitive final localization performance with state-of-the-art weakly supervised object detection methods, particularly for the object categories that are relatively salient in the images and deformable in structures.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call