Abstract
Weakly supervised object localization (WSOL) aims to localize objects using only image-level labels. A common WSOL approach, adversarial erasing, masks the most discriminative region in the feature space to compel the network to localize more of the object. However, once the discriminative region is erased, the localizer struggles to distinguish object regions from the background. In this paper, we propose a new feature disparity learning (FDL) method, which encourages the network to learn more distinctive features from the object region by measuring similarity after feature enhancement. Specifically, we first introduce a Spatial Vector Cross Attention (SVCA) module. This module enhances responses in the less discriminative regions of erased feature maps by reintegrating the spatial distribution of features, capturing interdependencies among the spatial vectors on each channel. Furthermore, we propose a feature complementarity loss that measures the similarity between unerased and erased features, guiding the network to learn the feature disparities caused by adversarial erasing and thereby improving localization and classification. Experiments demonstrate a significant improvement in localization performance over existing state-of-the-art erasing methods on the CUB-200-2011 and ILSVRC 2016 datasets.
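To make the erasing step concrete, the following is a minimal NumPy sketch of generic adversarial erasing followed by a similarity measurement between unerased and erased features. It is an illustration under stated assumptions, not the method of this paper: the channel-mean attention map, the 0.7 threshold, the use of cosine similarity, and the function names `erase_most_discriminative` and `cosine_similarity` are all hypothetical choices. The SVCA module and the actual form of the feature complementarity loss are not specified in the abstract and are not reproduced here.

```python
import numpy as np

def erase_most_discriminative(features, threshold=0.7):
    """Zero out the highest-response spatial positions of a (C, H, W) feature map.

    Assumption: attention is the channel-wise mean, min-max normalized to [0, 1].
    Returns the erased features and the (H, W) keep mask (0 where erased).
    """
    attention = features.mean(axis=0)
    lo, hi = attention.min(), attention.max()
    attention = (attention - lo) / (hi - lo + 1e-8)
    keep = (attention < threshold).astype(features.dtype)
    return features * keep, keep

def cosine_similarity(a, b):
    """Cosine similarity between two flattened feature tensors (illustrative measure)."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

# Toy feature map with a simulated discriminative region in the centre.
rng = np.random.default_rng(0)
features = rng.random((8, 6, 6))
features[:, 2:4, 2:4] += 5.0

erased, keep = erase_most_discriminative(features)
sim = cosine_similarity(features, erased)
print("positions kept:", int(keep.sum()), "of", keep.size)
print("similarity between unerased and erased features:", round(sim, 3))
```

In this toy setting the high-response centre is removed and the similarity drops below 1, which is the kind of disparity between unerased and erased features that a similarity-based loss could exploit.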