Abstract

Zero-shot learning (ZSL) tackles the unseen class recognition problem by transferring semantic knowledge from seen classes to unseen ones. Typically, to guarantee desirable knowledge transfer, a direct embedding is adopted for associating the visual and semantic domains in ZSL. However, most existing ZSL methods focus on learning the embedding from implicit global features or image regions to the semantic space. Thus, they fail to 1) exploit the appearance relationship priors between various local regions in a single image, which correspond to the semantic information, and 2) learn cooperative global and local features jointly for discriminative feature representations. In this article, we propose a novel graph navigated dual attention network (GNDAN) for ZSL to address these drawbacks. GNDAN employs a region-guided attention network (RAN) and a region-guided graph attention network (RGAT) to jointly learn a discriminative local embedding and incorporate global context for exploiting explicit global embeddings under the guidance of a graph. Specifically, RAN uses soft spatial attention to discover discriminative regions for generating local embeddings. Meanwhile, RGAT employs attribute-based attention to obtain attribute-based region features, where each attribute focuses on the most relevant image regions. Motivated by the graph neural network (GNN), which is beneficial for structural relationship representations, RGAT further leverages a graph attention network to exploit the relationships between the attribute-based region features for explicit global embedding representations. Based on the self-calibration mechanism, the learned joint visual embedding is matched with the semantic embedding to form the final prediction. Extensive experiments on three benchmark datasets demonstrate that the proposed GNDAN achieves superior performance over state-of-the-art methods. Our code and trained models are available at https://github.com/shiming-chen/GNDAN.
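
The attribute-based attention step, in which each attribute focuses on the most relevant image regions, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that region features and attribute vectors already live in the same space, uses a plain dot product as the relevance score, and omits the learned projections and the graph attention layer that RGAT applies afterwards. The names `attribute_attention`, `regions`, and `attributes` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attribute_attention(regions, attributes):
    """For each attribute, attend over all regions.

    regions:    list of R region feature vectors (each of length D)
    attributes: list of K attribute vectors (each of length D; assumed
                to share the region feature space, which is a
                simplification made for this sketch)
    Returns (weights, features): K softmax distributions over the R
    regions, and K attribute-based region features of length D.
    """
    dim = len(regions[0])
    weights, features = [], []
    for attr in attributes:
        # Relevance of every region to this attribute, normalised over regions.
        w = softmax([dot(attr, region) for region in regions])
        # Attention-weighted sum of region features.
        pooled = [sum(w_i * region[d] for w_i, region in zip(w, regions))
                  for d in range(dim)]
        weights.append(w)
        features.append(pooled)
    return weights, features


# Toy input: three regions and two attributes in a 2-D feature space.
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
attributes = [[4.0, 0.0], [0.0, 4.0]]
weights, features = attribute_attention(regions, attributes)
```

In this toy input the first attribute places most of its weight on the first region and the second attribute on the second region, so each pooled feature is dominated by the region most relevant to its attribute.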
