Abstract

Existing solutions for weakly supervised object detection (WSOD) generally follow the multiple instance learning (MIL) paradigm, which formulates WSOD as a multi-class classification problem over a set of region proposals. However, without the supervision signal of ground-truth boxes, the multi-class classification objective drives the detectors to concentrate on finding the most common pattern of each class, as the common pattern is always the most discriminative evidence for classification. In addition, although they learn to distinguish among multiple foreground classes, the detectors can still fail to differentiate foreground regions from background ones, which causes false alarms in prediction. These two points account for the limited localization capability of MIL-based WSOD methods. To address them, we propose foreground information guided WSOD (FI-WSOD), a novel framework that adds an extra foreground-background binary classification (F-BBC) sub-task to the original MIL-based WSOD paradigm. In particular, the proposed FI-WSOD improves object detectors at both the training and inference stages. At the training stage, the F-BBC task not only improves the feature representation of the network, but also provides extra information from the foreground-background perspective. By leveraging the learnt foreground information, a Foreground Guided Self-Training (FGST) module is further proposed to filter out noisy samples and to mine representative seeds from the remaining proposals. Moreover, a Multi-Seed Training strategy is used to reduce the impact of noisy labels when training the self-training networks in FGST. At the inference stage, the F-BBC results are used to update the initial multi-class classification scores, which are then integrated with the FGST results for better performance.
We have conducted extensive experiments on the prevalent Pascal VOC 2007, Pascal VOC 2012 and MSCOCO datasets, and report state-of-the-art results achieved by the proposed framework.
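
The inference-stage fusion described above can be illustrated with a minimal sketch. The abstract does not state the exact update rule, so everything here is an assumption: the function name `fuse_scores` is hypothetical, the F-BBC update is taken to be a multiplication by the foreground probability, and the integration with FGST is taken to be a plain average. The actual framework may combine the scores differently.

```python
def fuse_scores(cls_scores, fg_probs, fgst_scores):
    """Illustrative score fusion for a set of region proposals.

    cls_scores:  per-proposal lists of multi-class scores (MIL branch)
    fg_probs:    per-proposal foreground probabilities (F-BBC branch)
    fgst_scores: per-proposal lists of multi-class scores (FGST branch)

    Assumed rules (not taken from the paper): reweight by multiplication,
    then average with the self-training scores.
    """
    fused = []
    for cls_row, fg, st_row in zip(cls_scores, fg_probs, fgst_scores):
        # Assumed update: down-weight class scores of likely-background proposals.
        reweighted = [score * fg for score in cls_row]
        # Assumed integration: simple average with the FGST scores.
        fused.append([(r + t) / 2.0 for r, t in zip(reweighted, st_row)])
    return fused


# Two proposals with identical class scores; the second is judged background.
cls_scores = [[0.5, 0.25], [0.5, 0.25]]
fg_probs = [1.0, 0.0]
fgst_scores = [[0.5, 0.25], [0.25, 0.0]]

fused = fuse_scores(cls_scores, fg_probs, fgst_scores)
print(fused)
```

Under these assumed rules, the second proposal's fused scores drop even though its initial class scores match those of the first, which is the false-alarm suppression the foreground information is meant to provide.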
