Abstract

The visual-semantic gap between the visual space (visual features) and the semantic space (semantic attributes) is one of the main problems in the Generalized Zero-Shot Learning (GZSL) task. The essence of this problem is that the structures of the manifolds in these two spaces are inconsistent, which makes it difficult to learn embeddings that unify visual features and semantic attributes for similarity measurement. In this work, we tackle this problem by proposing a multi-modal aggregated-posterior-aligning neural network, based on Wasserstein Auto-encoders (WAEs), that learns a shared latent space for visual features and semantic attributes. The key to our approach is that the aggregated posterior distribution of the latent representations encoded from the visual features of each class is encouraged to align with a Gaussian distribution predicted by the corresponding semantic attribute in the latent space. On one hand, requiring the latent manifolds of visual features and semantic attributes to be consistent preserves the inter-class associations between seen and unseen classes. On the other hand, the aggregated posterior of each class is directly defined as a Gaussian in the latent space, which provides a reliable way to synthesize latent features for training classification models. We conducted extensive comparative evaluations on the AWA1, AWA2, CUB, aPY, FLO, and SUN benchmark datasets to demonstrate the advantages of our method over state-of-the-art approaches.
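To make the described mechanism concrete, the following is a minimal PyTorch sketch of the idea stated in the abstract: a visual encoder produces per-class latent codes whose aggregated posterior is pushed, via a WAE-style MMD penalty (here an IMQ kernel, as in the original WAE formulation), toward a Gaussian whose parameters are predicted from the class's semantic attributes. All module names, layer sizes, and the per-class batching (`VisualEncoder`, `AttributeEncoder`, `Decoder`, `imq_mmd`, `loss_for_class`) are illustrative assumptions, not the authors' implementation.

```python
# Sketch of per-class aggregated-posterior alignment with an attribute-predicted
# Gaussian, in the spirit of a WAE objective. Hyperparameters and architecture
# are placeholders, not the paper's actual settings.
import torch
import torch.nn as nn

class VisualEncoder(nn.Module):
    """Encodes visual features x into latent codes z."""
    def __init__(self, x_dim=2048, z_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(x_dim, 512), nn.ReLU(),
                                 nn.Linear(512, z_dim))
    def forward(self, x):
        return self.net(x)

class AttributeEncoder(nn.Module):
    """Predicts a per-class Gaussian (mu, log_var) in the latent space
    from the class's semantic attribute vector a."""
    def __init__(self, a_dim=85, z_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(a_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, z_dim)
        self.log_var = nn.Linear(256, z_dim)
    def forward(self, a):
        h = self.net(a)
        return self.mu(h), self.log_var(h)

class Decoder(nn.Module):
    """Reconstructs visual features from latent codes (WAE reconstruction term)."""
    def __init__(self, z_dim=64, x_dim=2048):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(z_dim, 512), nn.ReLU(),
                                 nn.Linear(512, x_dim))
    def forward(self, z):
        return self.net(z)

def imq_mmd(z_q, z_p, scale=1.0):
    """MMD estimate (biased, for simplicity) with an inverse multiquadratic
    kernel, used as the divergence between the class's aggregated-posterior
    samples z_q and samples z_p from the attribute-predicted Gaussian."""
    def kernel(a, b):
        c = 2.0 * a.size(1) * scale          # IMQ bandwidth heuristic
        d = torch.cdist(a, b).pow(2)
        return c / (c + d)
    return kernel(z_q, z_q).mean() + kernel(z_p, z_p).mean() - 2.0 * kernel(z_q, z_p).mean()

def loss_for_class(x, a, enc, attr, dec, lam=1.0):
    """WAE-style objective for a mini-batch of visual features x from one seen
    class with attribute vector a: reconstruction error plus an MMD penalty
    aligning the class's aggregated posterior with N(mu(a), sigma(a))."""
    z = enc(x)                                # aggregated-posterior samples
    mu, log_var = attr(a)                     # Gaussian predicted from attributes
    z_prior = mu + (0.5 * log_var).exp() * torch.randn_like(z)
    recon = (dec(z) - x).pow(2).mean()
    return recon + lam * imq_mmd(z, z_prior)

# Toy usage: one class with 85-dim attributes and 2048-dim visual features.
enc, attr, dec = VisualEncoder(), AttributeEncoder(), Decoder()
x = torch.randn(32, 2048)                     # visual features of one class
a = torch.randn(85).expand(32, 85)            # that class's attribute vector
print(loss_for_class(x, a, enc, attr, dec).item())
```

Because each class's aggregated posterior is tied to an explicit Gaussian in the latent space, sampling from the Gaussian predicted for an unseen class's attributes yields synthetic latent features on which a classifier can be trained, which is the GZSL use case the abstract describes.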
