Abstract

Zero-shot learning (ZSL) aims to recognize objects in images when no training data is available for the object classes. Under the generalized zero-shot learning (GZSL) setting, test objects may belong to either seen or unseen categories. Many recent studies perform zero-shot learning by leveraging generative networks to synthesize visual features for unseen classes from class-specific semantic features. However, user-defined semantic information is often incomplete and lacks discriminability, yet most generative methods use it directly as a constraint on the generative model, which causes the synthesized visual features to lack diversity and separability. In this paper, we propose a novel method that improves the semantic features by utilizing discriminative visual features. Furthermore, we build a novel Augmented Semantic Feature Based Generative Network (ASFGN) to synthesize separable visual representations for unseen classes. Since GAN-based generative models may suffer from mode collapse, we propose a novel collapse-alleviate loss to improve the training stability and generalization performance of our generative network. Extensive experiments on four benchmark datasets show that our method outperforms state-of-the-art approaches in both ZSL and GZSL settings.
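To make the feature-synthesis idea concrete, the following is a minimal sketch of the general conditional-generation approach used in generative ZSL, not the authors' ASFGN: a generator maps a noise vector concatenated with a class-specific semantic vector to a synthetic visual feature. All names and dimensions (`ConditionalGenerator`, `SEM_DIM`, `NOISE_DIM`, `VIS_DIM`) are illustrative assumptions.

```python
# Hedged illustration of conditional visual-feature synthesis for ZSL.
# This is a generic sketch, not the paper's ASFGN; dimensions are assumed.
import torch
import torch.nn as nn

SEM_DIM = 85      # length of the class semantic/attribute vector (assumption)
NOISE_DIM = 64    # latent noise size (assumption)
VIS_DIM = 2048    # visual feature size, e.g. ResNet-style CNN features (assumption)

class ConditionalGenerator(nn.Module):
    """Maps (noise, class semantics) to a synthetic visual feature."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NOISE_DIM + SEM_DIM, 1024),
            nn.LeakyReLU(0.2),
            nn.Linear(1024, VIS_DIM),
            nn.ReLU(),  # CNN pooling features are typically non-negative
        )

    def forward(self, z, semantics):
        # Condition generation on the class semantic vector by concatenation.
        return self.net(torch.cat([z, semantics], dim=1))

# Synthesize a batch of features for one unseen class from its semantic vector.
gen = ConditionalGenerator()
sem = torch.rand(1, SEM_DIM).expand(32, -1)  # 32 copies of one class embedding
z = torch.randn(32, NOISE_DIM)               # fresh noise gives feature diversity
fake_visual = gen(z, sem)                    # (32, VIS_DIM) synthetic features
```

In this family of methods, the synthesized unseen-class features are typically combined with real seen-class features to train an ordinary classifier, turning ZSL/GZSL into a supervised problem.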
