Abstract

Zero-shot learning algorithms have no training data for unseen classes. Instead, they use the semantic knowledge shared by seen and unseen classes to connect the visual space and the semantic space, and thereby recognize the unseen classes. In real applications, however, the original semantic representation characterizes neither the class-specific structure nor the discriminative information in the dimension space well, so unseen classes are easily misclassified as seen classes. To tackle this problem, we propose a Salient Attributes Learning Network (SALN) that generates a discriminative and expressive semantic representation under the supervision of visual features. An ℓ1,2-norm constraint is employed so that the learned semantic representation captures both the class-specific structure and the discriminative information in the dimension space. A feature alignment network then projects the learned semantic representation into the visual space, and a relation network performs classification. The proposed approach improves performance on five benchmark datasets in the generalized zero-shot learning task, and in-depth experiments demonstrate its effectiveness.
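As a rough illustration of the ℓ1,2-norm constraint mentioned above, the sketch below computes the norm under one common convention: take the ℓ1 norm of each row, then the ℓ2 norm of those row sums. The abstract does not state the exact definition, so this convention, the choice of rows as classes and columns as attribute dimensions, and the helper names `l12_norm` and `frobenius_norm` are all assumptions made for illustration, not the paper's formulation.

```python
import math

def l12_norm(matrix):
    """Assumed convention: sqrt(sum_i (sum_j |a_ij|)^2) over a list of rows."""
    # L1 norm of each row (hypothetically: one row per class).
    row_l1 = [sum(abs(x) for x in row) for row in matrix]
    # L2 norm across the per-row L1 norms.
    return math.sqrt(sum(s * s for s in row_l1))

def frobenius_norm(matrix):
    """Plain Frobenius norm, used here only as a point of comparison."""
    return math.sqrt(sum(x * x for row in matrix for x in row))

# Two toy "semantic" matrices with the same Frobenius norm.
dense = [[3.0, 4.0], [3.0, 4.0]]    # every attribute active in every row
sparse = [[5.0, 0.0], [0.0, 5.0]]   # one salient attribute per row

print(frobenius_norm(dense), frobenius_norm(sparse))
print(l12_norm(dense), l12_norm(sparse))
```

Under this convention the two matrices have equal Frobenius norm, but the matrix with a single active attribute per row has the smaller ℓ1,2 norm, so penalizing this quantity favors rows that concentrate their weight on a few attributes.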
