Abstract
Fine-grained visual classification (FGVC) aims to identify objects belonging to multiple sub-categories of the same super-category. The key to solving fine-grained classification problems is to learn discriminative visual feature representations from only subtle differences. Although previous work based on refined feature learning has made great progress, high-level semantic features often lack the key information needed to capture the nuances of fine-grained visual objects. How to efficiently integrate semantic information of different granularities from classification networks is therefore a critical problem. In this paper, we propose the Granularity-aware Distillation and Structure Modeling region Proposal Network (GDSMP-Net). Our solution integrates multi-granularity hierarchical information through a multi-granularity fusion learning strategy to enhance feature representation. To address the inherent challenge of large intra-class differences in FGVC, we propose a cross-layer self-distillation regularization that strengthens the connection between high-level and low-level semantics for robust multi-granularity feature learning. On this basis, we use a weakly supervised method to generate local branches and collaboratively learn discriminative semantics and structural semantics from local regions, enabling the model to perceive contextual information and capture structural interactions between local semantics. Comprehensive experiments show that our method achieves state-of-the-art performance on four widely used, challenging datasets (CUB-200-2011, Stanford Cars, FGVC-Aircraft and NABirds).
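To make the idea of cross-layer self-distillation more concrete, the following is a minimal sketch, not the GDSMP-Net implementation. It assumes that each network stage has its own classifier producing logits, that the deepest stage acts as the teacher for the shallower stages, and that the regularizer is a temperature-scaled KL divergence. The abstract specifies none of these choices, and all function names and the temperature value are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of class logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cross_layer_distillation_loss(shallow_logits, deep_logits, temperature=2.0):
    """Assumed form: the deep-stage prediction is the soft target (teacher)
    and the shallow-stage prediction is the student."""
    teacher = softmax(deep_logits, temperature)
    student = softmax(shallow_logits, temperature)
    # The T^2 factor is the usual rescaling used in knowledge distillation.
    return temperature ** 2 * kl_divergence(teacher, student)


def self_distillation_regularizer(per_stage_logits, temperature=2.0):
    """Sum the distillation loss of every shallower stage against the deepest one.

    per_stage_logits is ordered from the shallowest stage to the deepest.
    """
    *shallow_stages, deepest = per_stage_logits
    return sum(
        cross_layer_distillation_loss(stage, deepest, temperature)
        for stage in shallow_stages
    )
```

In a training loop, a term like `self_distillation_regularizer([stage1_logits, stage2_logits, stage3_logits])` would be added, with some weight, to the ordinary classification loss. It is zero when all stages agree and grows as the shallow-stage predictions diverge from the deepest stage.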