Learning Discriminative Projection With Visual Semantic Alignment for Generalized Zero Shot Learning

Pengzhen Du,Haofeng Zhang,Jianfeng Lu

doi:10.1109/access.2020.3022805

Open Access

https://doi.org/10.1109/access.2020.3022805

Abstract

Zero Shot Learning (ZSL) aims to classify samples from classes that have no training samples. It does so by transferring knowledge from source (seen) classes to target (unseen) classes, with semantic embeddings serving as the bridge. Generalized ZSL (GZSL) enlarges the search scope of ZSL from only the unseen classes to all classes, both seen and unseen. A large number of methods have been proposed for these two settings, and they achieve competitive performance. However, most of them still suffer from the domain shift problem, which is caused by the domain gap between the seen and unseen classes. In this article, we propose a novel method that learns discriminative features with visual-semantic alignment for GZSL. We define a latent space in which the visual features and semantic attributes are aligned, and assume that each prototype is a linear combination of the others, with the coefficients constrained to be the same in all three spaces (visual, latent, and semantic). To make the latent space more discriminative, a linear discriminative analysis strategy is employed to learn the projection matrix from the visual space to the latent space. Five popular datasets are used to evaluate the proposed method, and the results demonstrate the superiority of our approach over state-of-the-art methods. In addition, extensive ablation studies show the effectiveness of each module in our method.
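
The discriminative projection mentioned in the abstract can be illustrated with the classical Fisher criterion, which seeks directions that maximize between-class scatter relative to within-class scatter. The sketch below is only a generic illustration under that assumption, not the paper's actual objective. The function name `lda_projection`, the ridge term `reg`, and the toy data are hypothetical.

```python
import numpy as np


def lda_projection(X, y, dim, reg=1e-3):
    """Fisher-style projection: returns W of shape (d, dim).

    X: (n, d) visual features, y: (n,) integer class labels.
    `reg` is a hypothetical ridge term that keeps the within-class
    scatter matrix invertible.
    """
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    S_w = reg * np.eye(d)   # within-class scatter (regularized)
    S_b = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        S_w += (Xc - mu).T @ (Xc - mu)
        diff = (mu - mean_all)[:, None]
        S_b += len(Xc) * (diff @ diff.T)
    # Whiten with the Cholesky factor of S_w so the problem becomes symmetric.
    L = np.linalg.cholesky(S_w)                        # S_w = L @ L.T
    M = np.linalg.solve(L, np.linalg.solve(L, S_b).T)  # L^{-1} S_b L^{-T}
    M = (M + M.T) / 2                                  # remove round-off asymmetry
    _, V = np.linalg.eigh(M)                           # eigenvalues ascending
    top = V[:, ::-1][:, :dim]                          # leading `dim` eigenvectors
    return np.linalg.solve(L.T, top)                   # map back: W = L^{-T} V


# Toy data (assumed): two classes in 5-D that differ only along the first axis.
rng = np.random.default_rng(0)
X0 = rng.normal(size=(50, 5))
X1 = rng.normal(size=(50, 5))
X1[:, 0] += 6.0
X = np.vstack([X0, X1])
y = np.array([0] * 50 + [1] * 50)

W = lda_projection(X, y, dim=1)
Z = X @ W  # projected ("latent") features, shape (100, 1)
```

In this toy setting the two classes end up well separated along the single projected dimension, which is the property a discriminative latent space is meant to have.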

Highlights

With the development of deep learning techniques, image classification has been extended to large-scale datasets such as ImageNet [1], where it has reached human-level performance [2].
The contributions of our method are summarized as follows:
1) We propose a novel method that addresses the domain shift problem by learning discriminative projections with visual-semantic alignment in a latent space.
2) A linear discriminative analysis strategy is employed to learn the projection from the visual space to the latent space, which makes the projected features in the latent space more discriminative.
3) We assume that each prototype in all three spaces (visual, latent, and semantic) is a sparse linear combination of the other prototypes, and that the sparse coefficients are the same in all three spaces. This strategy establishes a link between seen and unseen classes, reduces the domain gap between them, and thereby addresses the domain shift problem.
4) Extensive experiments are conducted on five popular datasets, and the results show the superiority of our method.
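
The shared-coefficient assumption in the third contribution can be made concrete with a small example. The sketch below is a deliberate simplification, not the paper's optimization: it expresses one unseen-class semantic prototype as a sparse combination of seen-class prototypes only, and it uses two spaces rather than three. It then reuses the same coefficients in the visual space. The solver (ISTA for an L1-penalized least-squares problem), the function name `sparse_code`, the parameters `lam` and `n_iter`, and all data are assumptions made for illustration.

```python
import numpy as np


def sparse_code(target, atoms, lam=0.01, n_iter=500):
    """Minimize 0.5 * ||target - alpha @ atoms||^2 + lam * ||alpha||_1 with ISTA.

    atoms: (k, d) matrix whose rows are the other prototypes; target: (d,).
    """
    step = 1.0 / np.linalg.norm(atoms @ atoms.T, 2)  # 1 / Lipschitz constant
    alpha = np.zeros(atoms.shape[0])
    for _ in range(n_iter):
        grad = (alpha @ atoms - target) @ atoms.T    # gradient of the smooth term
        z = alpha - step * grad
        alpha = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)  # soft threshold
    return alpha


# Toy prototypes (assumed): 4 seen classes, 20 attributes, 10 visual dimensions.
rng = np.random.default_rng(0)
sem_seen = rng.normal(size=(4, 20))   # semantic prototypes of the seen classes
vis_seen = rng.normal(size=(4, 10))   # visual prototypes of the same classes
alpha_true = np.array([0.7, 0.0, 0.3, 0.0])
sem_unseen = alpha_true @ sem_seen    # attribute vector of a toy unseen class

alpha = sparse_code(sem_unseen, sem_seen)  # coefficients found in the semantic space
vis_unseen = alpha @ vis_seen              # same coefficients reused in the visual space
```

Because the coefficients are estimated where unseen-class information exists (the semantic space) and then applied where it does not (the visual space), this kind of constraint ties unseen classes to seen ones, which is the link the contribution describes.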

Summary

INTRODUCTION

With the development of deep learning techniques, image classification has been extended to large-scale datasets such as ImageNet [1], where it has reached human-level performance [2]. The contributions of our method are summarized as follows: 1) We propose a novel method that addresses the domain shift problem by learning discriminative projections with visual-semantic alignment in a latent space; 2) A linear discriminative analysis strategy is employed to learn the projection from the visual space to the latent space, which makes the projected features in the latent space more discriminative; 3) We assume that each prototype in all three spaces (visual, latent, and semantic) is a sparse linear combination of the other prototypes, and that the sparse coefficients are the same in all three spaces. This strategy establishes a link between seen and unseen classes, reduces the domain gap between them, and thereby addresses the domain shift problem; 4) Extensive experiments are conducted on five popular datasets, and the results show the superiority of our method.

RELATED WORKS

PROBLEM DEFINITION

DATASETS

COMPARISON WITH BASELINES

ABLATION STUDY

ZERO SHOT IMAGE RETRIEVAL

Findings

CONCLUSION

Journal: IEEE Access
Publication Date: Jan 1, 2020
License type: CC BY 4.0

Similar Papers

Residual-Prototype Generating Network for Generalized Zero-Shot Learning
Zeqing Zhang ... Weiwei Lin
01 Oct 2022
Mathematics | VOL. 10

Generative Mixup Networks for Zero-Shot Learning.
Bingrong Xu ... Cheng Lian
01 Jan 2024
IEEE Transactions on Neural Networks and Learning Systems | VOL. PP

Learning discriminative domain-invariant prototypes for generalized zero shot learning
Yinduo Wang ... Ling Shao
24 Mar 2020
Knowledge-Based Systems | VOL. 196

Generalized Zero-Shot Learning Via Multi-Modal Aggregated Posterior Aligning Neural Network
Xingyu Chen ... Xuguang Lan
25 Dec 2020
IEEE Transactions on Multimedia | VOL. 24
