Abstract

Fine-grained visual classification (FGVC) strives to distinguish images from distinct sub-classes within the same overarching meta-class, which is significant in various practical applications. Existing works mainly employ attention mechanisms to learn discriminative feature representations of objects under weakly supervised learning. In this paper, we argue that this likelihood-based attention learning manner often yields an inadequate feature representation, since the available image-level labels fail to provide an explicit supervisory signal for attention learning, especially when the fine-grained images share a small and inconsistent inter-class variance. To alleviate this issue, we approach this challenging task from the perspective of attacking the feature representation between similar sub-classes to maximize feature discriminativeness via learning adversarial examples, and propose an Adversarial-Aware Fine-Grained Visual Classification Network (A2Net). Specifically, we first propose an adversarial attack module based on projected gradient descent, which adds multi-scale adversarial perturbations to simulate sub-class examples with different similarities. Then, we introduce an adversarial attention generation module that estimates, through causal inference, the effect of the attention learned on adversarial and legitimate examples on the final class prediction. The adversarial attention generation module is encouraged to maximize this effect, which provides powerful supervision for capturing more attention on the discriminative parts. We further propose an adversarial-aware module to learn the feature-level differences between legitimate and adversarial examples, which helps sharpen the semantic boundaries of class-specific features for accurate FGVC.
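The abstract describes the attack module only at a high level. As background, the projected gradient descent (PGD) step it builds on can be sketched as follows; this is a minimal, framework-agnostic illustration using NumPy with a toy linear loss (all names here are hypothetical, not from the paper), where varying the budget `eps` corresponds to the multi-scale perturbations mentioned above:

```python
import numpy as np

def pgd_perturb(x, grad_fn, eps, alpha, steps):
    """L-infinity PGD: repeatedly ascend the loss, then project
    the perturbed input back into the eps-ball around x."""
    x_adv = x.copy()
    for _ in range(steps):
        g = grad_fn(x_adv)                       # gradient of loss w.r.t. input
        x_adv = x_adv + alpha * np.sign(g)       # signed-gradient ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps) # projection onto the eps-ball
    return x_adv

# Toy example: loss = w . x for fixed w, so the input gradient is just w.
w = np.array([1.0, -2.0, 0.5])
x = np.zeros(3)
x_adv = pgd_perturb(x, lambda z: w, eps=0.1, alpha=0.05, steps=4)
# The perturbation saturates at eps * sign(w) = [0.1, -0.1, 0.1].
```

In the paper's setting the loss gradient would come from the classification network itself, and perturbations at several `eps` scales simulate confusable examples of varying similarity.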
Extensive experiments demonstrate the efficacy of the proposed A2Net, which outperforms state-of-the-art FGVC methods on the CUB-200-2011, FGVC-Aircraft, Stanford Cars, Stanford Dogs, and NABirds benchmarks.
