Abstract

Zero-shot learning and self-supervised learning have both been widely studied because they enable efficient representation learning when labeled data are scarce. However, few studies consider zero-shot learning with semantic embeddings (e.g., CNN features or attributes) and self-supervision simultaneously. The reason is that most zero-shot learning works operate on vector-level semantic embeddings, whereas most self-supervision studies consider only image-level inputs, so a novel self-supervision method for vector-level CNN features is needed. We propose a simple method for shuffling semantic embeddings, and we further propose a method that enriches the feature representation and effectively improves zero-shot learning performance. We show that our model outperforms current state-of-the-art methods on the large-scale ImageNet 21K dataset and the small-scale CUB and SUN datasets.
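
The abstract does not spell out the shuffling operation itself. As a rough illustration of what a self-supervision signal over vector-level embeddings could look like, the sketch below permutes equal-sized chunks of each embedding vector and returns the permutation as a pretext label; the chunking scheme, chunk count, and all names here are our assumptions, not the paper's specification.

```python
import torch

def shuffle_embedding(emb, num_chunks=4, generator=None):
    """Illustrative sketch only (not the paper's exact method):
    split a vector-level semantic embedding into equal chunks,
    permute the chunks, and return both the shuffled vector and
    the permutation, which can serve as a self-supervision target."""
    d = emb.shape[-1]
    assert d % num_chunks == 0, "embedding dim must divide evenly into chunks"
    # View the last dimension as (num_chunks, chunk_size) and permute chunks.
    chunks = emb.reshape(*emb.shape[:-1], num_chunks, d // num_chunks)
    perm = torch.randperm(num_chunks, generator=generator)
    shuffled = chunks[..., perm, :].reshape(*emb.shape)
    return shuffled, perm

# Usage: a batch of 2048-d CNN-feature embeddings (dimensions assumed).
emb = torch.randn(32, 2048)
shuffled, perm = shuffle_embedding(emb)
```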
