Abstract

Fine-grained visual classification (FGVC) aims to identify objects belonging to multiple sub-categories of the same super-category (such as species of birds, or models of cars and aircraft). The key to solving fine-grained classification problems is to learn discriminative visual feature representations that capture only subtle differences. Although previous work based on refined feature learning has made great progress, high-level semantic features often lack the key information for fine-grained visual object nuances. How to efficiently integrate semantic information of different granularities from classification networks is therefore a critical challenge. In this paper, we propose the Multi-Granularity Feature Distillation Learning Network (MGFDL-Net). Our solution integrates multi-granularity hierarchical information through a multi-granularity fusion learning strategy to enhance feature representation. In view of the inherent challenge of large intra-class differences in FGVC, a cross-layer self-distillation regularization is proposed to strengthen the connection between high-level and low-level semantics for robust multi-granularity feature learning. Comprehensive experiments show that our method achieves state-of-the-art performance on three challenging FGVC datasets (CUB-200-2011, Stanford Cars and FGVC-Aircraft).
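The cross-layer self-distillation idea mentioned above can be illustrated with a minimal sketch. This is an assumption about the general technique, not the paper's actual implementation: the deepest (highest-granularity) branch's softened prediction acts as a teacher, and each shallower branch is pulled toward it with a KL-divergence penalty. The function and variable names here are hypothetical.

```python
import numpy as np

def softmax(z, t=1.0):
    # Temperature-softened softmax; higher t flattens the distribution.
    z = np.asarray(z, dtype=float) / t
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_div(p, q, eps=1e-12):
    # KL(p || q) with a small epsilon for numerical stability.
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

def self_distillation_loss(branch_logits, temperature=4.0):
    """Sketch of a cross-layer self-distillation regularizer.

    branch_logits: list of per-branch class logits, ordered shallow -> deep.
    The deepest branch serves as the teacher; each shallower branch is a
    student penalized by KL divergence from the teacher's soft targets.
    """
    teacher = softmax(branch_logits[-1], t=temperature)
    students = branch_logits[:-1]
    return sum(kl_div(teacher, softmax(s, t=temperature))
               for s in students) / len(students)

# Toy example: three granularity branches over 4 classes.
logits = [np.array([0.2, 0.1, 0.9, 0.0]),   # shallow / coarse branch
          np.array([0.5, 0.0, 1.2, 0.1]),   # intermediate branch
          np.array([0.1, 0.0, 2.0, 0.2])]   # deep / fine branch (teacher)
loss = self_distillation_loss(logits)
```

In a real network this term would be added to the per-branch classification losses, so low-level features are regularized toward the semantics captured at the deepest layer.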
