Abstract

Generalized Zero-Shot Learning (GZSL) aims to classify samples from seen and unseen classes using class-level features. Although existing GZSL methods have achieved remarkable progress, they assume a closed-set scenario in which both the seen and unseen classes are pre-defined. This assumption does not reflect the complexity and variability of real-world scenarios. To address this, a new paradigm called Open Zero-Shot Learning (OZSL) has emerged, which aims to identify seen and unseen classes while rejecting unknown classes, for which neither semantic nor visual features are available. Because unknown classes lack semantic features, the conventional methods adopted in GZSL are no longer applicable. We find that dissimilarities in the semantic space can be transferred to the visual space, yielding a shared dissimilarity space. Building on this idea, we propose an asymmetric Variational Autoencoder (VAE) architecture: the encoders compute dissimilarities between features and learn the latent distribution of those dissimilarities, while the decoders synthesize features for unseen and unknown classes conditioned on original features and dissimilarities. We evaluate the proposed method on benchmark datasets, including AWA, CUB, SUN, and FLO, and achieve state-of-the-art OZSL performance.
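To make the architecture described above concrete, the following is a minimal NumPy sketch of the core idea, not the paper's implementation: an encoder maps a *dissimilarity* between two semantic vectors to a latent Gaussian, and a decoder synthesizes a visual feature conditioned on an original (seen-class) visual feature plus a sampled dissimilarity code. All dimensions, layer shapes, and names are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not taken from the paper).
D_VIS, D_SEM, D_LAT = 16, 8, 4

def init_linear(n_in, n_out):
    """A single random linear layer: (weights, bias)."""
    return rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out)

def linear(x, wb):
    w, b = wb
    return x @ w + b

# Encoder: semantic dissimilarity -> latent Gaussian parameters.
enc_mu = init_linear(D_SEM, D_LAT)
enc_logvar = init_linear(D_SEM, D_LAT)

# Asymmetric decoder: conditioned on an original visual feature
# concatenated with a latent dissimilarity code.
dec = init_linear(D_VIS + D_LAT, D_VIS)

def encode(sem_dissim):
    return linear(sem_dissim, enc_mu), linear(sem_dissim, enc_logvar)

def reparameterize(mu, logvar):
    # Standard VAE reparameterization trick: z = mu + sigma * eps.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def decode(vis_feat, z):
    return linear(np.concatenate([vis_feat, z], axis=-1), dec)

# Transfer a semantic-space dissimilarity into the visual space:
sem_a = rng.normal(size=D_SEM)   # semantic vector of a seen class
sem_b = rng.normal(size=D_SEM)   # semantic vector of another class
vis_a = rng.normal(size=D_VIS)   # visual feature of the seen class

mu, logvar = encode(sem_a - sem_b)       # dissimilarity in semantic space
z = reparameterize(mu, logvar)           # latent dissimilarity code
vis_synth = decode(vis_a, z)             # synthesized visual feature
print(vis_synth.shape)                   # (16,)
```

In training, the encoder/decoder weights would be optimized with the usual VAE objective (reconstruction plus KL divergence); the sketch only shows the forward pass that carries a dissimilarity from the semantic space into the visual space.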
