Abstract
Zero-shot learning (ZSL) aims to classify images into categories unseen during training by relying on detailed attribute annotations. Generalized zero-shot learning (GZSL) additionally includes seen categories among the test samples; because the learned classifier is inherently biased toward seen categories, GZSL is more challenging than conventional ZSL. However, no dataset with detailed attribute descriptions currently exists for video classification, so existing zero-shot video classification methods rely on generative adversarial networks, trained on seen-class features, to synthesize unseen-class features for ZSL classification. To address this problem, we propose a descriptive text dataset built on the UCF101 action recognition dataset. To the best of our knowledge, this is the first work to add class descriptions to zero-shot video classification. We further propose a new loss function that combines visual features with textual features: we extract text features from the proposed dataset and constrain the generation of synthetic features under the principle that videos with similar textual descriptions should have similar visual features. Our method thus brings the traditional zero-shot learning paradigm back to video classification. Experimentally, the proposed dataset and method have a positive impact on generalized zero-shot video classification.
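The abstract does not specify the exact form of the proposed loss, but the stated principle, that videos with similar textual descriptions should have similar synthetic visual features, can be sketched as a pairwise-similarity consistency term. The following is a minimal, hypothetical NumPy illustration (the function names and the squared-difference form are assumptions, not the paper's actual formulation):

```python
import numpy as np

def cosine_sim_matrix(X):
    """Pairwise cosine similarity between the rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.clip(norms, 1e-8, None)  # avoid division by zero
    return Xn @ Xn.T

def text_consistency_loss(text_feats, synth_feats):
    """Hypothetical sketch of the text-similarity constraint:
    penalize disagreement between the pairwise similarity structure
    of class text features and of synthesized visual features."""
    S_text = cosine_sim_matrix(text_feats)
    S_vis = cosine_sim_matrix(synth_feats)
    return float(np.mean((S_text - S_vis) ** 2))

# Toy usage: 4 classes, 8-dim text embeddings, 8-dim synthetic features.
rng = np.random.default_rng(0)
T = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
loss = text_consistency_loss(T, V)
```

Such a term would be added to the generator's adversarial objective with a weighting coefficient; it is zero exactly when the two similarity matrices coincide.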