Abstract

Fine-grained visual classification (FGVC) is difficult because the subordinate categories within a basic-level category show low inter-class discrepancy and high intra-class variance, which remains challenging for traditional deep neural networks (DNNs). Moreover, different models extract the features of local parts in isolation and neglect their inherent correlations and distribution in high-dimensional space, which limits the accuracy a single model can achieve. In this paper, we propose a novel probability fusion decision framework, named PFDM-Net, for fine-grained visual classification. Specifically, it first employs data augmentation to enlarge the dataset and pretrains the basic VGG19 and ResNet networks on high-quality image datasets to learn common and domain knowledge simultaneously, while fine-tuning them with professional skill. Next, multiple refined DNNs with transfer learning are used to build a multi-stream feature extractor, which uses mixture-granularity information to exploit high-dimensional features for distinguishing inter-class discrepancy and tolerating intra-class variance. Finally, a probability fusion module, equipped with a gating network and a probability fusion layer, is developed to fuse the component models with Gaussian distributions into a unified probability representation for the final fine-grained recognition. The input of this module is the features from the multiple models, and the output is the fused classification probability. The end-to-end implementation of our framework contains an inner loop running the EM algorithm within an outer loop that optimizes the whole network by gradient back-propagation. Experimental results demonstrate that PFDM-Net achieves higher classification accuracy than state-of-the-art methods on different fine-grained datasets. Further discussion indicates potential applications in combination with other work.
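The fusion step described above can be illustrated with a minimal sketch. It assumes, as a simplification, that the gating network emits one score per stream and that the fusion layer forms a convex combination of each stream's class probabilities; the Gaussian modelling used in the paper is omitted. The function names, logits, and gate scores are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_probabilities(stream_logits, gate_scores):
    """Fuse per-stream class probabilities with gating weights.

    stream_logits: one list of class logits per model stream.
    gate_scores:   one unnormalized score per stream (hypothetical
                   stand-in for the gating network's output).
    """
    gate = softmax(gate_scores)  # one weight per stream, sums to 1
    stream_probs = [softmax(logits) for logits in stream_logits]
    num_classes = len(stream_probs[0])
    # Convex combination of the streams' class distributions.
    return [
        sum(g * probs[c] for g, probs in zip(gate, stream_probs))
        for c in range(num_classes)
    ]

# Illustrative values only: two streams (e.g. VGG19 and ResNet), three classes.
vgg_logits = [2.0, 0.5, 0.1]
resnet_logits = [0.3, 1.8, 0.2]
fused = fuse_probabilities([vgg_logits, resnet_logits], gate_scores=[0.2, 1.0])
predicted_class = fused.index(max(fused))
```

Because the gate weights sum to one and each stream outputs a valid distribution, the fused vector is itself a valid class distribution, so it can be fed directly to a standard classification loss.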

Highlights

  • Deep neural networks (DNNs) are among the most important research branches in machine learning

  • Thanks to breakthroughs in the design and training of DNNs with complex structures consisting of multiple processing layers or nonlinear transformations, unprecedented improvements have reached many aspects of artificial intelligence, especially the performance of visual classification on large-scale

  • In this paper, we design a novel probability fusion decision framework, named PFDM-Net, for fine-grained visual classification tasks


Summary

INTRODUCTION

Deep neural networks (DNNs) are among the most important research branches in machine learning. Distinguishing fine-grained categories with very similar outlines requires specialized knowledge that focuses on the feature representation of discriminative object parts, which is needed to extend existing DNNs to fine-grained visual classification (FGVC). Given the learned location features of object parts, a single WSL model is likely to focus on the constant architecture of the part distribution and lacks the capability to distinguish inter-class variance between similar fine-grained classes. The end-to-end implementation of the fusion module has an inner loop running the expectation-maximization (EM) algorithm used in the fusion layer and an outer loop performing backward gradient propagation through the whole network during training. This optimization design offers a high capacity for learning complementary yet correlated information about intra-class variance among the multi-grained feature maps of different models, which makes the proposed PFDM-Net more suitable for identifying slight discrepancies when distinguishing fine-grained categories in FGVC tasks.
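The inner EM loop mentioned above can be sketched on a toy problem. The example below fits a one-dimensional, two-component Gaussian mixture with plain EM; it is an illustration under stated assumptions, not the paper's fusion layer. The scalar features, the initial parameters, and the function names are all hypothetical, and the outer gradient back-propagation loop is not shown.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_step(data, weights, means, variances):
    """One EM iteration for a 1-D Gaussian mixture."""
    k = len(weights)
    # E-step: responsibility of each component for each data point.
    resp = []
    for x in data:
        joint = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
        total = sum(joint)
        resp.append([v / total for v in joint])
    # M-step: re-estimate mixing weights, means, and variances.
    new_w, new_m, new_v = [], [], []
    for j in range(k):
        nj = sum(r[j] for r in resp)
        mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nj
        new_w.append(nj / len(data))
        new_m.append(mean)
        new_v.append(max(var, 1e-6))  # floor the variance to avoid collapse
    return new_w, new_m, new_v

def run_inner_loop(data, weights, means, variances, num_iters=20):
    """Run a fixed number of EM iterations (the 'inner loop')."""
    for _ in range(num_iters):
        weights, means, variances = em_step(data, weights, means, variances)
    return weights, means, variances

# Hypothetical scalar features forming two well-separated groups.
features = [-2.1, -1.9, -2.0, -1.8, 2.0, 2.2, 1.9, 2.1]
weights, means, variances = run_inner_loop(
    features, weights=[0.5, 0.5], means=[-1.0, 1.0], variances=[1.0, 1.0]
)
```

In a nested scheme of the kind the summary describes, a routine like `run_inner_loop` would be called inside each training step to refresh the mixture parameters, after which the outer loop would update the network weights by gradient descent.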

RELATED WORK
Findings
DISCUSSION
CONCLUSION