Abstract

For image classification in computer vision, the performance of conventional deep neural networks (DNNs) typically drops when labeled training samples are limited. In this setting, few-shot learning (FSL) and, in particular, zero-shot learning (ZSL), i.e. classification of target classes with few or zero labeled training samples, have been proposed to imitate the strong learning ability of humans. However, recent investigations show that most existing ZSL models easily overfit and tend to misclassify target instances as classes seen in the training set. To alleviate this problem, we propose an embedding-based ZSL method with a self-focus mechanism: a focus-ratio that quantifies the importance of each dimension is introduced into the model optimization process. The objective function is reconstructed according to these focus-ratios, encouraging the embedding model to focus exclusively on important dimensions of the target space. Since the self-focus module takes part only in the training process, the overfitting knowledge is apportioned to it, and the remaining embedding model generalizes better to new classes at test time. Experimental results on four benchmarks, including AwA1, AwA2, aPY and CUB, show that our method outperforms state-of-the-art methods on coarse-grained ZSL tasks without degrading performance on fine-grained ZSL. Additionally, several comparisons demonstrate the superiority of the proposed mechanism.
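The core idea of weighting each target-space dimension by a focus-ratio during training can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `focus_weighted_loss`, the use of a squared-error embedding loss, and the softmax normalization of the focus logits are all assumptions made for clarity.

```python
import numpy as np

def focus_weighted_loss(pred, target, focus_logits):
    """Illustrative training loss: per-dimension squared errors are
    weighted by focus-ratios (softmax of learned logits), so the
    embedding model concentrates on important target-space dimensions.
    The focus module would be dropped at test time, where a plain
    (unweighted) embedding distance is used instead."""
    # Numerically stable softmax turns logits into focus-ratios summing to 1.
    w = np.exp(focus_logits - focus_logits.max())
    w = w / w.sum()
    per_dim = (pred - target) ** 2          # shape (batch, dim)
    # Weight each dimension's error by its focus-ratio, then average.
    return float((per_dim * w).sum(axis=1).mean())

# Example: an error confined to a dimension with near-zero focus-ratio
# contributes almost nothing to the loss.
pred = np.array([[0.0, 1.0]])
target = np.array([[0.0, 0.0]])
loss = focus_weighted_loss(pred, target, np.array([10.0, -10.0]))
```

With uniform focus logits this reduces to ordinary mean squared error, so the mechanism can be read as a learned reweighting of a standard embedding objective.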
