Abstract

Zero-shot learning (ZSL) transfers knowledge from seen to unseen categories through high-dimensional vectors that represent both seen and unseen class names. However, it suffers from domain shift arising from a lack of sufficient labeled data. Although transductive zero-shot learning (TZSL) mitigates this bias by including samples from unseen classes, improving its performance remains challenging. In this study, we introduce the Structure Alignment Variational Autoencoder Generative Adversarial Network (SA-VAEGAN), a novel approach that enhances the alignment between the visual and auxiliary spaces. We investigate the underlying causes of domain shift and propose a structural alignment (SA) strategy to address them. The SA model accounts for both inter-class and intra-class relationships: it leverages the model's comprehension of high-level semantic relations to disambiguate similar classes, and it mitigates intra-class confusion by penalizing atypical visual samples within each class. Evaluated on four benchmark datasets, SA-VAEGAN sets a new performance standard, underscoring its effectiveness in addressing the domain shift challenge in TZSL tasks.
