Abstract

With the rapid development of deep learning, the performance of fine-grained image classification has improved substantially. However, quickly and effectively focusing on the subtle discriminative details that distinguish sub-classes from one another remains challenging. In this paper, we propose a novel Multi-Scale Erasure and Confusion (MSEC) method to address this challenge. First, the input image is divided into several sub-regions, and a confidence function computes a confidence score for each sub-region. The sub-regions with lower confidence scores are then erased by the Region Erasure Module (REM), and the erased image is confused once by the Multi-scale Region Confusion Module (Multi-scale RCM). Second, the sub-regions with higher confidence scores are divided and confused again by the Multi-scale RCM, generating an image with multi-scale information. Finally, the backbone network extracts features from the erased image and the “destructed” image, and the whole network is optimized with a multi-loss function to perform classification. Extensive experiments on three standard fine-grained benchmark datasets, Stanford Dogs, CUB-200-2011 and FGVC-Aircraft, show that MSEC improves the accuracy of fine-grained image classification.
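The erase-then-confuse pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the confidence function, the grid sizes, the fill value, or the exact shuffling scheme, so the confidence scores are passed in as plain numbers, erased regions are filled with zeros, and "confusion" is approximated by a uniform random shuffle of patches. All function names are hypothetical.

```python
import random

def split_grid(image, n):
    """Split a 2D image (list of rows) into n*n patches in row-major order."""
    ph, pw = len(image) // n, len(image[0]) // n
    return [
        [row[c * pw:(c + 1) * pw] for row in image[r * ph:(r + 1) * ph]]
        for r in range(n) for c in range(n)
    ]

def merge_grid(patches, n):
    """Reassemble n*n row-major patches into a single 2D image."""
    image = []
    for r in range(n):
        row_patches = patches[r * n:(r + 1) * n]
        for y in range(len(row_patches[0])):
            image.append([px for patch in row_patches for px in patch[y]])
    return image

def erase_lowest(patches, scores, k, fill=0):
    """Replace the k lowest-scoring patches with a constant (assumed erasure)."""
    lowest = set(sorted(range(len(patches)), key=lambda i: scores[i])[:k])
    return [
        [[fill] * len(row) for row in p] if i in lowest else p
        for i, p in enumerate(patches)
    ]

def shuffle_patches(patches, rng):
    """Randomly permute patch positions (a stand-in for region confusion)."""
    shuffled = list(patches)
    rng.shuffle(shuffled)
    return shuffled

def shuffle_within(patch, n, rng):
    """Subdivide one patch into n*n finer patches and shuffle them."""
    return merge_grid(shuffle_patches(split_grid(patch, n), rng), n)

rng = random.Random(0)
image = [[y * 8 + x + 1 for x in range(8)] for y in range(8)]  # 8x8 toy image
scores = [0.9, 0.1, 0.8, 0.2]  # hypothetical confidence per 2x2 grid cell
patches = split_grid(image, 2)

# Step 1: erase the low-confidence regions, then confuse at the coarse scale.
erased = erase_lowest(patches, scores, k=2)
erased_image = merge_grid(shuffle_patches(erased, rng), 2)

# Step 2: subdivide and confuse the high-confidence regions at a finer scale.
top = set(sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:2])
fine = [shuffle_within(p, 2, rng) if i in top else p for i, p in enumerate(erased)]
multi_scale_image = merge_grid(shuffle_patches(fine, rng), 2)
```

In this sketch both output images keep the size and pixel values of the erased input and differ only in how the regions are arranged. In the method itself, the two images would then be passed to the backbone network for feature extraction.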
