Abstract
Zero-shot learning aims to recognize unseen classes using only samples from seen classes for training. The task is challenging because the feature representations of unseen-class samples are unavailable. Existing methods transfer the mapping learned on seen classes to unseen classes, using the correlation between classes as a bridge and relying on semantic representations to discriminate among classes. However, the visual representations of unseen classes are unavailable and the semantic representations alone are insufficiently discriminative, which limits these methods. This paper therefore proposes a zero-shot learning method based on a multi-modal knowledge graph (KG), in which learned visual representations complement the semantic representations. Specifically, a semantic KG is introduced to capture the correlation between classes, and this correlation is used to learn visual feature representations for all classes. The discriminative visual representations and the semantic representations are then combined to construct the multi-modal KG, through which the classifier for seen classes is transferred to unseen classes. Extensive experimental results demonstrate the effectiveness of the method.
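The idea of transferring a seen-class classifier to an unseen class through class correlation can be illustrated with a toy sketch. This is not the method proposed in the paper: it replaces the knowledge graph with a plain cosine similarity between semantic vectors, and every name in it (`transfer_classifier`, `seen_semantics`, `seen_weights`, `temperature`) as well as the numeric values are hypothetical, chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two semantic vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def transfer_classifier(unseen_semantic, seen_semantics, seen_weights,
                        temperature=0.1):
    """Build a classifier weight vector for one unseen class.

    Assumption (not from the paper): the unseen-class weights are a
    similarity-weighted average of the seen-class weights, where the
    similarity is measured between semantic representations.
    """
    scores = [cosine(unseen_semantic, s) / temperature for s in seen_semantics]
    attention = softmax(scores)
    dim = len(seen_weights[0])
    return [sum(a * w[d] for a, w in zip(attention, seen_weights))
            for d in range(dim)]


# Hypothetical toy data: two seen classes with 2-D semantic vectors and
# 2-D linear classifier weights.
seen_semantics = [[1.0, 0.0], [0.0, 1.0]]
seen_weights = [[1.0, 0.0], [0.0, 1.0]]

# An unseen class whose semantic vector lies close to the first seen class.
unseen_weight = transfer_classifier([0.9, 0.1], seen_semantics, seen_weights)
```

In this toy setting the unseen class inherits weights dominated by the seen class it most resembles semantically. The abstract's point is that semantic similarity alone may not be discriminative enough, which is why the paper adds learned visual representations to the graph.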