Abstract

Few-shot fine-grained visual classification aims to identify fine-grained concepts from very few samples and is widely used in many fields, such as classifying different species of birds in biological research and identifying car models in traffic monitoring. Compared with the common few-shot classification task, it is more difficult because of significant variations within each class and small gaps between different categories. To address these problems, previous studies primarily project support samples into the space of query samples and employ metric learning to classify query images into their respective categories. However, we observe that such methods are not effective in resolving inter-class variations. To overcome this limitation, we propose a new feature alignment method based on mutual mapping, which simultaneously considers the discriminative features of new samples and classes. Specifically, besides projecting support samples into the space of query samples to reduce intra-class variations, we also project query samples into the space of support samples to increase inter-class variations. Furthermore, a direct position self-reconstruction module is proposed to exploit the location information of objects and obtain more discriminative features. Extensive experiments on four fine-grained benchmarks demonstrate that our approach is competitive with other state-of-the-art methods in both 1-shot and 5-shot settings. In the 5-shot setting, our method achieves the best performance on all four datasets, with accuracies of 92.11%, 85.31%, 96.09%, and 94.64% on CUB-200-2011, Stanford Dogs, Stanford Cars, and Aircraft, respectively.
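To make the mutual-mapping idea in the abstract concrete, the sketch below illustrates one plausible instantiation: local features are re-expressed in each other's space via ridge-regression reconstruction, and the two reconstruction errors are combined into a class score. This is only a minimal illustration of the bidirectional alignment concept; the function names, shapes, regularizer `lam`, and scoring rule are assumptions, not the authors' exact formulation, and the position self-reconstruction module is not modeled here.

```python
# Minimal sketch of bidirectional ("mutual") feature alignment.
# All names and hyperparameters are hypothetical; this is not the paper's code.
import torch

def reconstruct(target, basis, lam=0.1):
    """Express `target` features in the space spanned by `basis` features.

    target: [n, d] local features to be re-expressed (e.g., query descriptors)
    basis:  [m, d] local features defining the space (e.g., support descriptors)
    Uses ridge regression with regularizer `lam` for numerical stability.
    """
    gram = basis @ basis.t()                                     # [m, m]
    inv = torch.linalg.inv(gram + lam * torch.eye(basis.size(0)))
    weights = target @ basis.t() @ inv                           # [n, m]
    return weights @ basis                                       # [n, d]

def mutual_alignment_score(query_feats, support_feats, lam=0.1):
    """Score one query against one class's support pool in both directions:
    support -> query space (intra-class alignment) and
    query -> support space (inter-class separation)."""
    s_in_q = reconstruct(support_feats, query_feats, lam)   # support in query space
    q_in_s = reconstruct(query_feats, support_feats, lam)   # query in support space
    err_s = (s_in_q - support_feats).pow(2).mean()
    err_q = (q_in_s - query_feats).pow(2).mean()
    return -(err_s + err_q)   # higher score = better mutual alignment

# Toy usage: 5-way, 5-shot scoring of a single query image.
d, hw = 64, 25                                   # feature dim, local descriptors per image
query = torch.randn(hw, d)
supports = [torch.randn(5 * hw, d) for _ in range(5)]   # one pooled support set per class
scores = torch.stack([mutual_alignment_score(query, s) for s in supports])
pred = scores.argmax().item()                    # predicted class index
```

In this reading, the support-to-query direction pulls class prototypes toward the query's appearance (reducing intra-class variation), while the query-to-support direction measures how well each class's support basis can explain the query, which penalizes classes with poorly matching features (enlarging inter-class gaps).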
