Leveraging Dual Variational Autoencoders and Generative Adversarial Networks for Enhanced Multimodal Interaction in Zero-Shot Learning

Ning Li, Wenzhuo Xiao, Jie Chen, Chunming Gao, Tianrun Ye, Nanxin Fu, Ping Zhang

doi:10.3390/electronics13030539

Abstract

In the evolving field of taxonomic classification, and especially in zero-shot learning (ZSL), accurately classifying entities unseen in the training data remains a significant hurdle. Although the existing literature is rich in developments, it often falls short in two critical areas: semantic consistency (ensuring classifications align with true meanings) and the effective handling of dataset diversity biases. These gaps have created a need for a more robust approach that can handle both. This paper introduces an integration of transformer models with variational autoencoders (VAEs) and generative adversarial networks (GANs), with the aim of addressing these two gaps within the ZSL framework. The choice of VAE-GAN is driven by the complementary strengths of the two models: VAEs are proficient in providing a richer representation of data patterns, and GANs are able to generate data that is diverse yet representative, thus mitigating biases from dataset diversity. Transformers are employed to further enhance semantic consistency, an area in which many existing models underperform. In experiments conducted on benchmark ZSL datasets such as CUB, SUN, and Animals with Attributes 2 (AWA2), our approach demonstrates significant improvements, not only in enhancing semantic and structural coherence, but also in effectively addressing dataset biases. This leads to a notable enhancement of the model’s ability to generalize visual categorization tasks beyond the training data, thus filling a critical gap in the current ZSL research landscape.
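
To make the VAE-GAN pairing concrete, the following is a minimal sketch of how a generic VAE-GAN objective combines a reconstruction term, a KL regularizer, and an adversarial term for a single sample. It is not the authors' formulation: the abstract does not specify loss functions, so the function names, the squared-error reconstruction, the non-saturating generator loss, and the `beta` weight are all illustrative assumptions.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) )."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def squared_error(x, x_hat):
    """Reconstruction error between a feature vector and its decoding."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def bce(p, target):
    """Binary cross-entropy for one discriminator probability p in (0, 1)."""
    eps = 1e-12  # guards against log(0)
    return -(target * math.log(p + eps)
             + (1.0 - target) * math.log(1.0 - p + eps))


def vae_gan_losses(x, x_hat, mu, log_var, d_real, d_fake, beta=1.0):
    """Return (generator-side loss, discriminator loss) for one sample.

    Hypothetical combination: VAE loss plus a non-saturating adversarial
    term for the encoder/decoder, standard real-vs-fake loss for the
    discriminator. d_real and d_fake are discriminator probabilities.
    """
    vae_term = squared_error(x, x_hat) + beta * kl_to_standard_normal(mu, log_var)
    adv_term = bce(d_fake, 1.0)   # decoder wants fakes scored as real
    disc_loss = bce(d_real, 1.0) + bce(d_fake, 0.0)
    return vae_term + adv_term, disc_loss


if __name__ == "__main__":
    rng = random.Random(0)
    mu, log_var = [0.2, -0.1], [0.0, -1.0]
    z = reparameterize(mu, log_var, rng)
    print("latent sample:", z)
    print("losses:", vae_gan_losses([1.0, 2.0], [0.9, 2.1], mu, log_var, 0.8, 0.3))
```

In this sketch the KL term keeps the latent code close to a standard normal prior (the "richer representation" role the abstract assigns to the VAE), while the adversarial term pushes generated features toward the real feature distribution (the role assigned to the GAN). How the transformer component and the class semantics enter the objective is not stated in the abstract and is left out here.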

Journal: Electronics	Publication Date: Jan 29, 2024
Citations: 1	License type: CC BY 4.0

Similar Papers

OntoZSL: Ontology-enhanced Zero-shot Learning
Yuxia Geng ... Huajun Chen
19 Apr 2021

Dual-stream generative adversarial networks for distributionally robust zero-shot learning
Huan Liu ... Yanzhang Lyu
Information Sciences | VOL. 519
20 Jan 2020

Alleviating Domain Shift via Discriminative Learning for Generalized Zero-Shot Learning
Yalan Ye ... Yukun He
IEEE Transactions on Multimedia | VOL. 24
08 Mar 2021

Learning Feature-to-Feature Translator by Alternating Back-Propagation for Generative Zero-Shot Learning
Yizhe Zhu ... Jianwen Xie
01 Oct 2019
