Abstract

Fine-Grained Visual Categorization (FGVC) is one of the challenging domains in computer vision because the differences between subordinate categories of the same basic category are subtle. To address this problem, recent FGVC methods first localize discriminative regions and then extract fine-grained features from them. However, it is difficult to mine discriminative regions that are both consistent and adequate for final recognition. In this paper, a novel framework for FGVC is proposed. It is composed of three branches that extract global features, discriminative fine-grained features, and complementary fine-grained features, respectively. Concretely, discriminative fine-grained features are extracted from a region obtained by the class activation map (CAM) method, and complementary fine-grained features are extracted from regions proposed by a feature pyramid network (FPN), assisted by an algorithm that learns to rank the regions. Experimental results show that the proposed method achieves better performance than several state-of-the-art models on three datasets: CUB-200-2011, Stanford Cars, and FGVC Aircraft.
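
To make the CAM-based localization step more concrete, the following is a minimal sketch of how a discriminative region could be derived from a class activation map. It follows the generic CAM formulation (a class-weighted sum of the final convolutional feature maps) and is not taken from the paper itself. The function names, the array shapes, and the 0.5 threshold are assumptions chosen for illustration.

```python
import numpy as np


def class_activation_map(features, fc_weights, class_idx):
    """Compute a normalized CAM (illustrative helper, not the paper's code).

    features:   array of shape (C, H, W), final conv feature maps.
    fc_weights: array of shape (num_classes, C), classifier weights
                applied after global average pooling.
    """
    # Weighted sum over channels using the weights of the chosen class.
    cam = np.tensordot(fc_weights[class_idx], features, axes=([0], [0]))
    # Shift and scale to [0, 1]; a constant map is left as all zeros.
    cam = cam - cam.min()
    peak = cam.max()
    return cam / peak if peak > 0 else cam


def cam_bounding_box(cam, threshold=0.5):
    """Return (row_min, row_max, col_min, col_max), inclusive.

    The threshold is an assumed hyperparameter. If no cell reaches it,
    the whole map is returned as the region.
    """
    rows, cols = np.where(cam >= threshold)
    if rows.size == 0:
        return 0, cam.shape[0] - 1, 0, cam.shape[1] - 1
    return int(rows.min()), int(rows.max()), int(cols.min()), int(cols.max())
```

In a pipeline of this kind, the returned box would typically be rescaled to input-image coordinates, and the cropped region would be fed to the branch that extracts discriminative fine-grained features.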
