Abstract

Zero-shot learning (ZSL) was originally designed to address the small sample size problem often encountered in computer vision: it recognizes unseen object classes without any training samples. Existing ZSL models (particularly deep ones) often assume that hundreds of labelled samples are available for each seen class. In real-world applications, this assumption often does not hold. This paper therefore considers a new ZSL setting: each seen class has only a few labelled samples, while each unseen class still has none. This setting is more challenging, yet more practical, than the conventional ZSL setting. To overcome the extreme label scarcity, we obtain additional training samples from an image search engine for data augmentation: the name of each seen class is used as a Google query, and the top returned images are treated as noisy labelled samples for that class. With the augmented but noisy labelled training data, we propose a novel inductive ZSL model that formulates label noise reduction (LNR) and semantic projection learning (SPL) within a unified framework: (1) LNR refines the noisy labelled samples for projection learning; (2) SPL learns the projection function from the refined training data. Extensive experiments show that our ZSL model outperforms state-of-the-art alternatives.
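To make the two components concrete, the following is a minimal illustrative sketch in Python (NumPy), not the model proposed in the paper. It assumes a crude stand-in for LNR (keep the web images whose features lie closest to the mean of the few clean samples) and a generic ridge-regression projection from visual features to a semantic space for SPL, followed by nearest-prototype classification of unseen classes. All function names, the `keep_ratio` and `lam` parameters, and the choice of ridge regression are assumptions made for illustration; the paper's unified formulation learns both parts jointly.

```python
import numpy as np


def filter_noisy_samples(clean_feats, web_feats, keep_ratio=0.5):
    """Hypothetical LNR stand-in: keep the web samples nearest the clean-sample mean."""
    centroid = clean_feats.mean(axis=0)
    dists = np.linalg.norm(web_feats - centroid, axis=1)
    n_keep = max(1, int(round(keep_ratio * len(web_feats))))
    keep_idx = np.argsort(dists)[:n_keep]
    return web_feats[keep_idx]


def learn_projection(X, S, lam=1.0):
    """Hypothetical SPL stand-in: ridge regression from visual features X (n x d)
    to semantic vectors S (n x k), i.e. W = (X^T X + lam I)^{-1} X^T S."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ S)


def predict_unseen(X_test, W, unseen_prototypes):
    """Project test features and return the index of the nearest unseen-class prototype."""
    projected = X_test @ W  # (n x k)
    diffs = projected[:, None, :] - unseen_prototypes[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)  # (n x number of unseen classes)
    return dists.argmin(axis=1)
```

In this sketch the filtering step runs once per seen class before the projection is fitted, which keeps the two stages independent; a joint formulation such as the one described in the abstract would instead let the projection inform which noisy samples to trust.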
