Abstract

Most zero-shot learning (ZSL) methods learn a mapping from the visual feature space to the semantic feature space, or project both spaces into a common joint space and align them there. However, these methods neither exploit the visual and semantic information fully nor exclude irrelevant information. Moreover, most ZSL methods suffer from a strong bias problem: instances from unseen classes tend to be predicted as seen classes. In this paper, we propose a method based on bidirectional projections between the visual and semantic feature spaces that combines the advantages of generative adversarial networks (GANs). GANs are used to perform bidirectional generation and alignment between the visual and semantic features, and a cycle-mapping structure ensures that the important information is preserved through the alignments. Furthermore, to better address the bias problem, pseudo-labels are generated for unseen instances and the model is iteratively adjusted with them. We conduct extensive experiments in both the traditional ZSL and the generalized ZSL settings. The results confirm that our method achieves state-of-the-art performance on the popular AWA2, aPY and SUN datasets.
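To make the bidirectional cycle-mapping idea concrete, below is a minimal PyTorch sketch of one generator update: two generators map between the visual and semantic spaces, adversarial losses align the generated features with real ones, and cycle losses force a round trip to recover the input. All names and settings here (G_vs, G_sv, D_s, D_v, the feature dimensions, lam_cyc) are illustrative assumptions, not the paper's implementation; the discriminator updates and the pseudo-labeling loop are omitted for brevity.

import torch
import torch.nn as nn

VIS_DIM, SEM_DIM = 2048, 85  # assumed sizes, e.g. ResNet features and AWA2 attributes

def mlp(in_dim, out_dim, hidden=1024):
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                         nn.Linear(hidden, out_dim))

G_vs = mlp(VIS_DIM, SEM_DIM)  # generator: visual -> semantic
G_sv = mlp(SEM_DIM, VIS_DIM)  # generator: semantic -> visual
D_s = nn.Sequential(mlp(SEM_DIM, 1), nn.Sigmoid())  # semantic-space discriminator
D_v = nn.Sequential(mlp(VIS_DIM, 1), nn.Sigmoid())  # visual-space discriminator

bce, l1 = nn.BCELoss(), nn.L1Loss()
opt_g = torch.optim.Adam(list(G_vs.parameters()) + list(G_sv.parameters()), lr=1e-4)

def generator_step(v, s, lam_cyc=10.0):
    """One generator update: adversarial alignment plus cycle consistency.
    (Only the generators are optimized here; the discriminator step is omitted.)"""
    s_fake, v_fake = G_vs(v), G_sv(s)
    # Adversarial terms: generated features should fool both discriminators.
    real = torch.ones(v.size(0), 1)
    adv = bce(D_s(s_fake), real) + bce(D_v(v_fake), real)
    # Cycle terms: mapping across and back should recover the input,
    # which is what keeps the important information through the alignments.
    cyc = l1(G_sv(s_fake), v) + l1(G_vs(v_fake), s)
    loss = adv + lam_cyc * cyc
    opt_g.zero_grad(); loss.backward(); opt_g.step()
    return loss.item()

# Toy batch: 8 visual features paired with their class attribute vectors.
loss = generator_step(torch.randn(8, VIS_DIM), torch.rand(8, SEM_DIM))

In the full method, instances from unseen classes would additionally receive pseudo-labels from the current model, and training would alternate between label assignment and updates such as the one above.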
