Abstract
Fine-grained image categorization remains a challenging computer vision problem. Most existing methods rely heavily on massive labeled data, which is scarce in many real-world applications. Moreover, progressive learning demands on existing data are now common: we may come to care about finer-grained categories (such as arctic tern, black tern, buttercup, or tulip) in a data set originally labeled only with coarse classes like “bird” and “flower”. It is reasonable to expect that the existing labels, together with a model carrying transferable knowledge, can help with a related but different fine-grained recognition task. In this context, an improved transfer deep learning approach with hierarchical multi-adversarial networks is proposed in this paper. With this approach, cross-domain features are first coarsely extracted by deep encoders. Next, we annotate a small number of images in the target domain, creating “active labels” that guide the adversarial learning. A GAN-based hierarchical model is then utilized to select cross-domain categories and enhance the related features so as to facilitate effective transfer. To exploit useful local features, a novel adaptive attention mechanism, the Region Adversarial Network (RAN), which selects attention regions during adversarial learning and generates valuable fine-grained features, is introduced. We call the proposed hierarchical framework “Attentional Multi-Adversarial Networks (AMAN)”. Experimental results show that AMAN augments cross-domain features effectively and builds an accurate classifier for fine-grained categorization in the target domain with fewer training samples.
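The progressive refinement described above, from coarse labels such as “bird” to fine-grained ones such as “arctic tern”, can be sketched as a toy two-stage lookup. The label hierarchy and the scoring functions below are illustrative assumptions only; the paper's actual classifiers are trained adversarially, not hand-written like this.

```python
# Hypothetical coarse-to-fine label hierarchy: an existing data set
# labeled "bird"/"flower" is progressively refined into fine-grained
# categories. Structure and names are illustrative, not from the paper.
LABEL_HIERARCHY = {
    "bird": ["arctic tern", "black tern"],
    "flower": ["buttercup", "tulip"],
}

def classify_coarse_to_fine(image_features, coarse_scorer, fine_scorer):
    """Two-stage prediction: pick a coarse class first, then compare
    only the fine-grained labels nested under that coarse class."""
    coarse = max(LABEL_HIERARCHY,
                 key=lambda c: coarse_scorer(image_features, c))
    fine = max(LABEL_HIERARCHY[coarse],
               key=lambda f: fine_scorer(image_features, f))
    return coarse, fine
```

The point of the two-stage design is that fine-grained labels are never compared across coarse classes, which mirrors how the hierarchical framework narrows the transfer problem before discriminating between visually similar subcategories.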
Highlights
Fine-grained image recognition tasks are very common in everyday life
This framework has four distinguishing characteristics: 1) instead of aligning features with a single GAN, a hierarchical structure with multiple GANs is designed to learn transferable information from coarse to fine; 2) the feature extractor and the generator are divided into two different but related parts of our model; 3) built on few-shot adversarial learning, the generators play the key role in fine-grained classification with new, related labels in the target domain, unlike existing methods that depend more heavily on discriminators; the generators reinforce and map useful features into the target domain, which suits progressive transfer learning; 4) fine-tuned SVMs are embedded in the framework, substituting for the last fully-connected layer
Inspired by the region proposal network in Faster R-CNN [64] and the attention proposal network in RA-CNN [24], we propose the Region Adversarial Network (RAN), which adaptively selects discriminative attention regions in an image and enhances the corresponding features at different scales via the generator G described in Section 4.1
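A minimal, non-learned sketch of what such region selection does: locate the peak of a 2-D attention map and crop square boxes around it at multiple scales. The function name, box format, and scales here are hypothetical stand-ins; RAN itself learns this selection adversarially rather than by argmax.

```python
def select_attention_region(attention_map, scales=(1, 2)):
    """Pick the peak of a 2-D attention map (list of lists) and return
    square crops around it, one per scale, as (row0, col0, row1, col1)
    boxes clamped to the map bounds. A toy stand-in for RAN's learned,
    adversarial region selection."""
    h, w = len(attention_map), len(attention_map[0])
    # Locate the strongest attention response.
    peak_r, peak_c = max(
        ((r, c) for r in range(h) for c in range(w)),
        key=lambda rc: attention_map[rc[0]][rc[1]],
    )
    boxes = []
    for s in scales:
        r0, c0 = max(0, peak_r - s), max(0, peak_c - s)
        r1, c1 = min(h - 1, peak_r + s), min(w - 1, peak_c + s)
        boxes.append((r0, c0, r1, c1))
    return boxes
```

Returning crops at several scales echoes the multi-scale enhancement in the text: a small box isolates the most discriminative part (e.g. a tern's beak), while a larger one keeps surrounding context for the generator to work with.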