Abstract
The fine-grained image recognition (FGIR) task is dedicated to distinguishing similar sub-categories that belong to the same super-category, such as bird species and car types. In order to highlight visual differences, existing FGIR works often follow two steps: discriminative sub-region localization and local feature representation. However, these works pay less attention to global context information. They neglect the fact that subtle visual differences in challenging scenarios can be highlighted by exploiting the spatial relationships among different sub-regions from a global viewpoint. Therefore, in this paper, we consider both global and local information for FGIR, and propose a collaborative teacher-student strategy to reinforce and unify the two types of information. Our framework is implemented mainly with convolutional neural networks and is referred to as the Teacher-Student Based Attention Convolutional Neural Network (T-S-ACNN). For fine-grained local information, we choose the classic multi-attention network (MA-Net) as our baseline, and propose a type of boundary constraint to further reduce background noise in the local attention maps. In this way, the discriminative sub-regions tend to appear in the area occupied by fine-grained objects, leading to more accurate sub-region localization. For fine-grained global information, we design a graph convolution based global attention network (GA-Net), which combines the local attention maps extracted by MA-Net with non-local techniques to explore the spatial relationships among sub-regions. Finally, we develop a collaborative teacher-student strategy that adaptively determines the attended roles and optimization modes, so as to achieve the cooperative reinforcement of MA-Net and GA-Net. Extensive experiments on the CUB-200-2011, Stanford Cars and FGVC Aircraft datasets illustrate the promising performance of our framework.
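To make the non-local spatial-relation idea behind GA-Net concrete, below is a minimal PyTorch sketch of a standard non-local block that lets every spatial position attend to every other one. This is an illustration of the generic non-local technique the abstract names, not the authors' GA-Net implementation; the module and variable names (NonLocalRelation, theta, phi, g) are hypothetical.

```python
# Minimal sketch of a non-local spatial-relation block in the spirit of
# GA-Net's global attention; all names are hypothetical, not the paper's code.
import torch
import torch.nn as nn

class NonLocalRelation(nn.Module):
    """Refines each spatial position's feature using pairwise affinities
    with all other positions, capturing sub-region relationships globally."""
    def __init__(self, channels: int, reduced: int | None = None):
        super().__init__()
        reduced = reduced or channels // 2
        self.theta = nn.Conv2d(channels, reduced, kernel_size=1)  # query
        self.phi = nn.Conv2d(channels, reduced, kernel_size=1)    # key
        self.g = nn.Conv2d(channels, reduced, kernel_size=1)      # value
        self.out = nn.Conv2d(reduced, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)  # (B, HW, C')
        k = self.phi(x).flatten(2)                    # (B, C', HW)
        v = self.g(x).flatten(2).transpose(1, 2)      # (B, HW, C')
        affinity = torch.softmax(q @ k, dim=-1)       # (B, HW, HW) relations
        y = (affinity @ v).transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)                        # residual refinement
```

In GA-Net the inputs to such a block would be the local attention maps produced by MA-Net rather than raw feature maps, so the affinity matrix directly encodes spatial relationships among discriminative sub-regions.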
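The collaborative teacher-student strategy adaptively decides which network teaches and which learns. The sketch below shows one plausible way such an objective could be realized; the role-assignment rule used here (the branch with higher confidence on the ground-truth class acts as teacher for that batch) is an illustrative assumption, not the paper's exact strategy.

```python
# Hedged sketch of a collaborative teacher-student objective; the
# confidence-based role assignment is an assumption for illustration only.
import torch
import torch.nn.functional as F

def collaborative_loss(logits_ma, logits_ga, targets,
                       temperature: float = 4.0, alpha: float = 0.5):
    """Cross-entropy on both branches plus a distillation term whose
    direction (who teaches whom) is chosen per batch by confidence."""
    ce = F.cross_entropy(logits_ma, targets) + F.cross_entropy(logits_ga, targets)

    # Mean confidence of each branch on the ground-truth class.
    conf_ma = F.softmax(logits_ma, dim=1).gather(1, targets[:, None]).mean()
    conf_ga = F.softmax(logits_ga, dim=1).gather(1, targets[:, None]).mean()

    # The more confident branch teaches; detaching the teacher ensures the
    # distillation term only updates the student branch.
    if conf_ma >= conf_ga:
        teacher, student = logits_ma.detach(), logits_ga
    else:
        teacher, student = logits_ga.detach(), logits_ma

    kd = F.kl_div(
        F.log_softmax(student / temperature, dim=1),
        F.softmax(teacher / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2

    return ce + alpha * kd
```

Because the teacher role is re-evaluated continually rather than fixed in advance, both MA-Net and GA-Net can reinforce each other over the course of training, which matches the cooperative optimization the abstract describes.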