Abstract
The fine-grained image classification task is about differentiating between different object classes. The difficulties of the task are large intra-class variance and small inter-class variance. For this reason, improving models’ accuracies on the task heavily relies on discriminative parts’ annotations and regional parts’ annotations. Such delicate annotations’ dependency causes the restriction on models’ practicability. To tackle this issue, a saliency module based on a weakly supervised fine-grained image classification model is proposed by this article. Through our salient region localization module, the proposed model can localize essential regional parts with the use of saliency maps, while only image class annotations are provided. Besides, the bilinear attention module can improve the performance on feature extraction by using higher- and lower-level layers of the network to fuse regional features with global features. With the application of the bilinear attention architecture, we propose the different layer feature fusion module to improve the expression ability of model features. We tested and verified our model on public datasets released specifically for fine-grained image classification. The results of our test show that our proposed model can achieve close to state-of-the-art classification performance on various datasets, while only the least training data are provided. Such a result indicates that the practicality of our model is incredibly improved since fine-grained image datasets are expensive.
Highlights
Image classification is gaining increasing attention mainly for its wide use in the Internet of Things, self-driving cars, security, medical treatment, etc
We can derive the conclusion that the bilinear deep neural network can make better use of the relationship between regional features and global features
From the derived experimental results, our proposed method obtained a better performance than B-Convolutional Neural Network (CNN), reaching a classification accuracy of 85.1%
Summary
Image classification is gaining increasing attention mainly for its wide use in the Internet of Things, self-driving cars, security, medical treatment, etc. People’s daily life has been changed by the use of computer-based automatic classification and recognition. Such usage is facing growing challenges for people who are no longer satisfied with getting coarse-grained classification results but desire finer-grained ones. Different from general object classification, which aims to distinguish basic-level categories, fine-grained image classification focuses on recognizing images that belong to the same basic category but not the same class or subcategory [1,2]. In the security domain, while monitoring vehicles passing through checkpoints, coarse-grained information.
Published Version (Free)
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.