Abstract

The progress of fine-grained visual categorization (FGVC) benefits from the application of deep neural networks, especially convolutional neural networks (CNNs), which heavily rely on large amounts of labeled data for training. However, it is hard to obtain the accurate labels of similar fine-grained subcategories because labeling needs professional knowledge, which is labor-consuming and time-consuming. Therefore, it is appealing and significant to recognize these similar fine-grained subcategories with a few labeled samples or even only one for training, which is a highly challenging task. In this paper, we propose OLOS (Only Learn One Sample), a new data augmentation approach for fine-grained visual categorization with only one sample training, and its main novelties are: (1) A 4-stage data augmentation approach is proposed to increase both the volume and variety of the one training image, which provides more visual information with multiple views and scales. It consists of a 2-stage data generation and a 2-stage data selection. (2) The 2-stage data generation approach is proposed to produce image patches relevant to the object and its parts for the one training image, as well as produce new images conditioned on the textual descriptions of the training image. (3) The 2-stage data selection approach is proposed to conduct screening on the generated images in order that useful information is remained and noisy information is eliminated. Experimental results and analyses on fine-grained visual categorization benchmark demonstrate that our proposed OLOS approach can be applied on top of existing methods, and improves their categorization performance.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call