Abstract

Zero-shot learning aims to recognize unseen categories by learning an embedding space between data samples and semantic representations. For the large-scale datasets with thousands of categories, embedding vectors of category labels are often used for semantic representation since it is difficult to define the semantic attributes of categories manually. Facing the problem of underutilization of prior knowledge during the construction of embedding vectors, this paper first constructs a novel knowledge graph as the supplement to the basic WordNet graph, and then proposes a fast hybrid model ARGCN-DKG, which means Attention based Residual Graph Convolutional Network on Different types of Knowledge Graphs. By introducing residual mechanism and attention mechanism, and integrating different knowledge graphs, the accuracy of knowledge transfer between different categories can be improved. Our model only use 2-layer GCN, the pretrained image features and category semantic features, so the training process could be done in minitues on single GPU, which could be one of the fastest training models for large-scale image recognition. Experiment results demonstrate that ARGCN-DKG model could get better results for large-scale datasets than the state-of-the-art model.

Highlights

  • In recent years, great progress has been made in image recognition methods based on supervised learning

  • In this paper, we first construct a novel knowledge graph as the supplement to the basic WordNet graph to speed up knowledge transfer, propose a hybrid model, ARGCN–DKG, for zero–shot image recognition based on different knowledge graphs

  • This shows that RGCN and ARGCN–DKG models are effective in knowledge transfer and have great potential

Read more

Summary

INTRODUCTION

Great progress has been made in image recognition methods based on supervised learning. It is difficult to define the semantic attributes of each category manually, so researchers usually use embedding vectors to represent the categories by training a large amount of text corpus Since these embedding vectors do not highlight the visual features from the prior knowledge, most of current algorithms are difficult to achieve good classification results for large–scale datasets [18], [19]. In this paper, we first construct a novel knowledge graph as the supplement to the basic WordNet graph to speed up knowledge transfer, propose a hybrid model, ARGCN–DKG, for zero–shot image recognition based on different knowledge graphs This model uses graph convolutional networks that fuse residuals and attention mechanisms on two kinds of knowledge graphs to generate more accurate semantic knowledge embedded representation of image categories.

A NEW KNOWLEDGE GRAPH
ATTENTION MODULE
TRAINING AND TESTING PROCESS
EXPERIMENT
TRAINING DETAILS
Findings
CONCLUSION
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call