Abstract

Compared with zero-shot learning (ZSL), the generalized zero-shot learning (GZSL) is more challenging since its test samples are taken from both seen and unseen classes. Most previous mapping-based methods perform well on ZSL, while their performance degrades on GZSL. To solve this problem, inspired by the ensemble learning, this paper proposes a model with cooperative coupled generative networks (CCGN). Firstly, to alleviate the hubness problem, the reverse visual feature space is taken as the embedding space, with the mapping achieved by a visual feature center generation network. To learn a proper visual representation of each class, we propose a coupled of generative networks, which cooperate with each other to synthesize a visual feature center template of the class. Secondly, to improve the generative ability of the coupled networks, we further employ a deeper network to generate. Meanwhile, to alleviate loss semantic information problem caused by multiple network layers, a residual module is employed. Thirdly, to mitigate overfitting and to increase scalability, an adversarial network is introduced to discriminate the generation of visual feature centers. Finally, a reconstruction network, which reverses the generation process, is employed to restrict the structural correlation between the generated visual feature center and the original semantic representation of each class. Extensive experiments on five benchmark datasets (AWA1, AWA2, CUB, SUN, APY) demonstrate that the proposed algorithm yields satisfactory results, as compared with the state-of-the-art methods.

Highlights

  • Due to the availability of big data and the development of artificial neural networks, supervised learning methods have achieved great success [1]

  • Inspired by human’s ability of identifying new objects, researchers have pay attention to study on zeroshot learning (ZSL) [3]–[5], which is referred as identifying unseen classes by utilizing the auxiliary information included in the seen classes [6]–[8]

  • A more challenging generalized zero-shot learning (GZSL) [9], [10] problem attracts the attentions of researchers

Read more

Summary

INTRODUCTION

Due to the availability of big data and the development of artificial neural networks, supervised learning methods have achieved great success [1]. DEM [14], a competitive mapping-based method, learns a deep mapping model from semantic attribute template to a visual feature center. The synthesis-based methods exploit the correlations between semantic attributes, and the generate-based methods generally fully utilize the powerful generation abilities of generative adversarial networks The objective of both methods is to synthesize or generate samples of unseen classes and transforms GZSL into traditional classification problems. G1 and G2 cooperate with each other to keep the stability of the model Their outputs are fused by balancing the ratio of the coefficients of the features generated, resulting in a template of visual feature center for a given class with which the unseen samples are classified. 1 for each i ∈ [1, T ] do 2 Fix the G2, D and R, train the G1 using Eq(2); 3 Fix the G1, G2 and R, train the D using Eq(3); 4 Fix the D, G1 and R, train the G2 using Eq(4); 5 Fix the G1, G2 and D, train the R using Eq(5); 6 end 7 Return G1 and G2 parameters

EXPERIMENTAL STUDIES
Findings
CONCLUSION
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.