Abstract

Zero-shot learning (ZSL) aims to recognize unseen objects by transferring class-level semantic descriptions. Previous methods are devoted to bridging instance-level objects with class-level semantics through feature generation or co-embedding, neglecting prototype-level and distribution-level associations, which hinders narrowing the visual-semantic gap. This paper presents a novel prototype rectification framework for ZSL, termed PRZSL, which learns and calibrates dual prototype distributions in a meta-domain. We first propose a contrastive embedding module with a compatibility loss and an angular loss to keep inter-class prototypes well separated. We then collaboratively rectify the dual prototypes by injecting the prototype distribution information of the other modality, boosting visual-semantic alignment at the distribution level. Unlike previous methods that anchor the semantic positions, our semantic prototypes also participate in the collaborative updates, thereby promoting alignment from semantics to vision. Comprehensive experiments on five zero-shot benchmarks demonstrate that the proposed method achieves competitive performance compared with state-of-the-art methods.
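The abstract does not give formulas, but the two embedding objectives it names can be read as follows: a compatibility loss that pulls instance features toward their class prototype, and an angular loss that pushes inter-class prototypes apart on the unit sphere. The sketch below is a minimal, hypothetical numpy rendering of that reading (function names, the temperature parameter, and the exact penalty form are assumptions, not the paper's definitions):

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Project vectors onto the unit sphere (avoids scale effects in cosine terms)."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def compatibility_loss(features, labels, prototypes, temperature=0.1):
    """Hypothetical compatibility loss: cross-entropy over cosine similarities
    between instance features and class prototypes (temperature is assumed)."""
    f = l2_normalize(features)
    p = l2_normalize(prototypes)
    logits = f @ p.T / temperature
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def angular_separation_loss(prototypes):
    """Hypothetical angular loss: penalize high cosine similarity (small angles)
    between every pair of distinct class prototypes."""
    p = l2_normalize(prototypes)
    cos = p @ p.T
    off_diag = cos[~np.eye(len(p), dtype=bool)]
    return np.clip(off_diag, 0.0, None).mean()
```

Minimizing the sum of the two terms would make prototypes both representative of their classes and well separated from each other, which matches the stated goal of the contrastive embedding module.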
