Abstract

Fine-grained image recognition is highly challenging because the differences between images are subtle. Many approaches address it by combining data augmentation with jointly optimized deep metric learning. CutMix is an effective data augmentation strategy that generates new images by cropping a region from one image and merging it into another. However, it sometimes produces meaningless images or images in which the object is obscured, which degrades recognition performance. We propose a novel framework that addresses this problem by extending CutMix with a part localization method. We further improve recognition accuracy by jointly optimizing a pairwise margin loss on the images generated by the improved CutMix. Some of the generated images are similar to the reference image because they are produced by replacing parts of the reference image with similar parts. Since these generated images and the reference image have similar semantic meaning, they should not lie much farther apart than the margin value in the embedding space. However, the conventional margin loss cannot account for generated images that lie far beyond the margin. To solve this problem, we propose an additional margin loss that takes these generated images into account. The proposed framework consists of two stages: part localization-aware CutMix and an adaptive pairwise margin loss. The proposed method achieves state-of-the-art performance on the CUB-200-2011, FGVC-Aircraft, Stanford Cars, and DeepFashion datasets. Furthermore, extensive experiments demonstrate that each stage improves the final performance.
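For orientation, the two building blocks named in the abstract can be sketched in their conventional forms. The code below is a minimal illustration, not the proposed method: `cutmix` implements standard CutMix with a randomly placed box (the part localization-aware variant is not specified here), and `pairwise_margin_loss` is a contrastive-style stand-in for a conventional pairwise margin loss (the adaptive variant is likewise not specified). The function names, parameters, and the default margin of 1.0 are assumptions made for this example.

```python
import numpy as np


def cutmix(image_a, image_b, lam, rng):
    """Standard CutMix: paste a random box from image_b into image_a.

    image_a, image_b: arrays of shape (H, W, C) with identical shapes.
    lam: target fraction of image_a to keep, in [0, 1].
    Returns the mixed image and the actual kept fraction after clipping.
    """
    h, w = image_a.shape[:2]
    # Box side lengths are chosen so the box area is about (1 - lam) * H * W.
    cut_ratio = np.sqrt(1.0 - lam)
    cut_h, cut_w = int(h * cut_ratio), int(w * cut_ratio)
    # Uniformly sampled box centre; the box is clipped to the image borders.
    cy, cx = int(rng.integers(0, h)), int(rng.integers(0, w))
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)
    mixed = image_a.copy()
    mixed[y1:y2, x1:x2] = image_b[y1:y2, x1:x2]
    # Recompute the mixing ratio from the clipped box.
    lam_adjusted = 1.0 - (y2 - y1) * (x2 - x1) / (h * w)
    return mixed, lam_adjusted


def pairwise_margin_loss(emb_a, emb_b, same_class, margin=1.0):
    """Contrastive-style pairwise margin loss on one pair of embeddings.

    Similar pairs are penalised by squared distance; dissimilar pairs are
    penalised only when they are closer than the margin.
    """
    dist = float(np.linalg.norm(emb_a - emb_b))
    if same_class:
        return dist ** 2
    return max(margin - dist, 0.0) ** 2
```

Because the box in this sketch is placed uniformly at random, it can cover the discriminative part of the object or paste in pure background, which is the failure mode the abstract attributes to CutMix.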
