Abstract

Image-level weakly supervised object detection (WSOD) has advanced significantly by adopting multiple instance learning (MIL) as its basic framework. However, instance ambiguity and part domination persist, preventing detectors from producing accurate bounding boxes that cover whole objects. Moreover, existing methods typically treat WSOD as a two-stage task: weakly supervised detection followed by strongly supervised retraining. Directly using the bounding-box-level labels predicted in the first stage, irrespective of their quality, yields only marginal improvements in detection performance. In this paper, we introduce a novel attention module and a bounding box refinement module in the first stage. The attention module operates across both spatial and global dimensions, enhancing the saliency and discriminative characteristics of region features associated with positive samples. The bounding box refinement module employs multiple strategies to optimize labels, with the goal of generating high-quality bounding-box-level labels for the subsequent strongly supervised retraining. In the second stage, we propose Loss-based Label Division (LLD) and Score-guided Weight Adjustment (SWA) strategies, which mitigate the impact of noisy labels during retraining. We conduct comprehensive ablation experiments to validate the effectiveness of the proposed modules and strategies. Experimental results on two public benchmarks, VOC2007 and VOC2012, show that our method achieves satisfactory performance. Code, models, and additional details are available at https://github.com/better-chao/WSOD.
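For readers unfamiliar with the MIL formulation mentioned above, the following is a minimal sketch of how per-proposal scores are commonly aggregated into image-level class scores so that training needs only image-level labels. It assumes a two-branch, WSDDN-style design; the abstract does not specify the detector used in this work, and the names `mil_image_scores`, `image_level_bce`, `cls_logits`, and `det_logits` are hypothetical, chosen only for illustration.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def mil_image_scores(cls_logits, det_logits):
    """Aggregate per-proposal logits into image-level class scores.

    cls_logits[r][c] and det_logits[r][c] are hypothetical branch outputs
    for proposal r and class c (an assumed WSDDN-style layout, not
    necessarily the architecture used in the paper).
    """
    num_props = len(cls_logits)
    num_classes = len(cls_logits[0])
    # Classification branch: softmax over classes, one proposal at a time.
    cls_probs = [softmax(row) for row in cls_logits]
    # Detection branch: softmax over proposals, one class at a time.
    det_probs = [
        softmax([det_logits[r][c] for r in range(num_props)])
        for c in range(num_classes)
    ]
    # Proposal score is the product of both branches; the image-level
    # score for a class is the sum of its proposal scores, so it is in [0, 1].
    return [
        sum(cls_probs[r][c] * det_probs[c][r] for r in range(num_props))
        for c in range(num_classes)
    ]


def image_level_bce(image_scores, labels, eps=1e-7):
    """Binary cross-entropy between image scores and image-level labels."""
    loss = 0.0
    for score, label in zip(image_scores, labels):
        score = min(max(score, eps), 1.0 - eps)  # avoid log(0)
        loss -= label * math.log(score) + (1 - label) * math.log(1.0 - score)
    return loss
```

Because the loss sees only the summed score, a single highly discriminative proposal (often an object part) can satisfy it, which is one way to see where the part domination problem described above comes from.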
