Weakly supervised object localization (WSOL) uses only image-level annotations to learn a pixel-level localization model, which reduces the human effort required for annotation. Most one-stage WSOL methods learn the localization model with multiple-instance learning, which causes them to activate only the most discriminative object parts rather than the whole object. In this work, we attribute this problem to the domain shift between the training and test processes of WSOL and offer a novel perspective that views WSOL as a domain adaptation (DA) task. Under this perspective, we design a DA-WSOL pipeline that better assists WSOL with DA approaches by accounting for the specific properties of adaptation in WSOL. Using a proposed target sampling strategy, the DA-WSOL pipeline distinguishes source-related and Universum samples from the other target samples, then uses them to address the sample imbalance and label mismatch between the source and target domains of WSOL. Experiments show that our pipeline outperforms state-of-the-art methods on three WSOL benchmarks and improves the performance of downstream weakly supervised semantic segmentation tasks.