Abstract

Most existing object detection algorithms are trained on fully annotated object regions or bounding boxes, which are labor-intensive to obtain. In contrast, a large amount of image-level annotation is now cheaply available on the Internet, so it is natural to exploit such "weak" supervision to train object detectors. In this paper, we propose a novel scheme for weakly supervised object localization, termed object-specific pixel gradient (OPG). OPG is trained using image-level annotations alone and operates iteratively to localize potential objects in a given image robustly and efficiently. In particular, we first extract an OPG map that reveals the contribution of each pixel to a given object category, upon which an iterative mining scheme extracts instances or components of that object. Moreover, a novel average-and-max pooling layer is introduced to improve localization accuracy. In the task of weakly supervised object localization, OPG achieves a state-of-the-art 44.5% top-5 error on ILSVRC 2013 and outperforms competing methods, including Oquab et al. and region-based convolutional neural networks, on Pascal VOC 2012 by 2.6% and 2.3%, respectively. In the task of object detection, OPG achieves a comparable 27.0% mean average precision on Pascal VOC 2007. In all experiments, OPG adopts only an off-the-shelf pretrained CNN model, without using any object proposals; it therefore also significantly improves detection speed, running three times faster than the state-of-the-art method.
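
The abstract describes the OPG map only at a high level: each pixel is scored by its contribution to a given object category. This matches the general gradient-based saliency idea of backpropagating a class score to the input pixels. The following is a minimal sketch of that general technique, not the paper's exact implementation; the pretrained VGG-16 backbone, the input preprocessing, and the channel-wise max reduction are all illustrative assumptions.

import torch
from torchvision import models, transforms
from PIL import Image

# Off-the-shelf pretrained CNN; VGG-16 is an assumption for illustration,
# since the abstract does not name the backbone.
model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
model.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = Image.open("example.jpg").convert("RGB")  # hypothetical input image
x = preprocess(img).unsqueeze(0)
x.requires_grad_(True)

# Forward pass, then backpropagate the score of one target class so that
# x.grad holds each pixel's contribution to that class.
scores = model(x)
target = scores.argmax(dim=1).item()  # or a fixed image-level label
scores[0, target].backward()

# Collapse per-channel gradients into a single map, e.g. the maximum
# absolute gradient over the RGB channels.
opg_map = x.grad.abs().max(dim=1).values.squeeze(0)  # (224, 224) saliency map

Thresholding or iteratively re-mining such a map to pull out object instances would then proceed along the lines the abstract sketches; the details of that mining scheme are given in the paper itself.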
