Abstract

Weakly supervised object detection (WSOD) has received widespread attention because it requires only image-level category annotations to train a detector. Many advanced approaches adopt a two-phase learning framework: instance mining, which classifies generated proposals via multiple instance learning, and instance refinement, which iteratively refines bounding boxes using the supervision produced by the preceding stage. In this paper, we observe that detection performance is usually limited by imprecise supervision, namely part domination and untight boxes. To mitigate their adverse effects, we focus on selecting high-quality proposals as the supervision for WSOD. Specifically, to address part domination, we propose bottom-up aggregated attention, which incorporates low-level features from shallow layers to improve the location representation of top-level features, so that proposals covering entire objects receive high scores. Because it requires no additional learnable parameters or learning branches, it can be flexibly plugged into existing WSOD frameworks. To address untight boxes, we propose a phase-aware loss, which is the first to measure supervision quality by the loss in the instance mining phase, in order to highlight correct boxes and suppress untight ones. We integrate the two proposed modules into the framework of online instance classifier refinement. Extensive experiments on PASCAL VOC and MS COCO demonstrate that our method significantly improves the performance of WSOD and achieves state-of-the-art results. The code is available at https://github.com/Horatio9702/BUAA_PALoss.
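
For readers unfamiliar with the instance mining phase mentioned above, the following is a minimal NumPy sketch of one common multiple instance learning formulation (a two-stream, WSDDN-style scoring scheme). It is an illustrative assumption, not the method proposed in this paper: the function name `mil_instance_mining`, its arguments, and the choice of the top-scoring proposal as the seed are hypothetical and are not taken from the released code.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def mil_instance_mining(cls_logits, det_logits, image_labels, eps=1e-8):
    """Hypothetical sketch of MIL-based instance mining.

    cls_logits, det_logits: arrays of shape (R, C) for R proposals, C classes.
    image_labels: array of shape (C,) with entries in {0, 1}.
    """
    # Classification stream: which class does each proposal look like?
    cls_scores = softmax(cls_logits, axis=1)
    # Detection stream: which proposals matter most for each class?
    det_scores = softmax(det_logits, axis=0)
    # Per-proposal, per-class scores.
    proposal_scores = cls_scores * det_scores
    # Image-level score per class; each value lies in (0, 1).
    image_scores = np.clip(proposal_scores.sum(axis=0), eps, 1.0 - eps)
    # Binary cross-entropy against the image-level labels.
    loss = -np.sum(
        image_labels * np.log(image_scores)
        + (1 - image_labels) * np.log(1.0 - image_scores)
    )
    # Assumption: the top-scoring proposal of each present class is used
    # as the seed supervision for the later refinement phase.
    seeds = {
        int(c): int(proposal_scores[:, c].argmax())
        for c in np.flatnonzero(image_labels)
    }
    return proposal_scores, image_scores, loss, seeds
```

Under this assumed scheme, the seed for each class is simply the highest-scoring proposal, which is where part domination can arise: a discriminative object part may outscore the box that covers the whole object.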
