Abstract

In fine-grained visual classification (FGVC), distinguishing subcategories of the same category (such as birds, cars, or airplanes) depends mainly on finding discriminative features and accurately locating the regions that contain them. In this article, we propose to apply global average pooling to slices of the feature maps in order to find significant discriminative regions without complicated network designs or training operations, and to use a Drop-Block mechanism to address network overfitting. Specifically, we take the feature maps of multiple branches as inputs and average-pool them with kernels of different sizes, obtaining feature maps that contain both deep and shallow information and therefore yield more accurate granularities. We call our method the More Accurate Multi-Granular Convolutional Neural Network (MAG-CNN). Unlike networks with more complex designs, MAG-CNN requires only common operations such as pooling and convolution, yet achieves higher accuracy. MAG-CNN can be trained end-to-end without any bounding-box annotations, and it reaches state-of-the-art performance on three common fine-grained image classification benchmark datasets (CUB-Birds, FGVC-Aircraft, and Stanford-Cars).
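As a rough illustration of the pooling idea described above, the following dependency-free Python sketch average-pools a toy activation map with two kernel sizes and reports the strongest window at each granularity. It is not the MAG-CNN implementation: the channel averaging, the non-overlapping windows, the argmax selection rule, and all function names (`channel_mean`, `average_pool`, `most_salient_region`) are assumptions made for this example, and the Drop-Block step is omitted.

```python
def channel_mean(feature_maps):
    """Average a C x H x W nested list over channels, giving an H x W map."""
    channels = len(feature_maps)
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    return [
        [sum(fm[r][c] for fm in feature_maps) / channels for c in range(width)]
        for r in range(height)
    ]


def average_pool(activation, kernel):
    """Non-overlapping average pooling (stride == kernel) over an H x W map."""
    height, width = len(activation), len(activation[0])
    pooled = []
    for top in range(0, height - kernel + 1, kernel):
        row = []
        for left in range(0, width - kernel + 1, kernel):
            window = [
                activation[r][c]
                for r in range(top, top + kernel)
                for c in range(left, left + kernel)
            ]
            row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled


def most_salient_region(activation, kernel):
    """Return (top, left, height, width) of the window with the highest mean.

    Assumption: the "discriminative region" is simply the argmax window.
    """
    pooled = average_pool(activation, kernel)
    _, (i, j) = max(
        (value, (i, j))
        for i, row in enumerate(pooled)
        for j, value in enumerate(row)
    )
    return (i * kernel, j * kernel, kernel, kernel)


# Toy 2-channel, 4 x 4 feature map with a strong response in the lower right.
features = [
    [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 3]],
    [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]],
]
activation = channel_mean(features)

# A small kernel gives a fine-grained region, a larger kernel a coarser one.
regions = {k: most_salient_region(activation, k) for k in (1, 2)}
print(regions)
```

In this toy case the smaller kernel isolates a single strongly activated cell, while the larger kernel returns the surrounding 2 x 2 block, which is the sense in which different kernel sizes correspond to different granularities.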
