Abstract
Knowledge graphs are used as a source of prior knowledge in numerous computer vision tasks. However, such an approach requires a mapping between ground truth data labels and the target knowledge graph. We linked the ILSVRC 2012 dataset (often simply referred to as ImageNet) labels to Wikidata entities. This enables using rich knowledge graph structure and contextual information for several computer vision tasks traditionally benchmarked with ImageNet and its variations. For instance, in few-shot classification scenarios with neural networks, this mapping can be leveraged for weight initialisation, which can improve the final performance metrics. We mapped all 1000 ImageNet labels: 461 were already directly linked with the exact match property (P2888), 467 have exact match candidates, and 72 cannot be matched directly. For these 72 labels, we discuss the different problem categories that prevent finding an exact match. Semantically close non-exact match candidates are presented as well. The mapping is publicly available at https://github.com/DominikFilipiak/imagenet-to-wikidata-mapping.
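As an illustration of how an existing exact match (P2888) link could be looked up, the following is a minimal sketch that builds a SPARQL query string for one ImageNet label. The function names are hypothetical, and the WordNet 3.0 URI scheme (`http://wordnet-rdf.princeton.edu/wn30/<offset>-<pos>`) is an assumption about how the synsets are referenced on Wikidata, not something stated in this abstract. The sketch only constructs the query; it does not contact any endpoint.

```python
def wnid_to_wordnet_uri(wnid: str) -> str:
    """Turn an ImageNet-style WordNet ID (e.g. 'n02084071') into a WordNet 3.0 URI.

    The URI scheme used here is an assumption made for illustration.
    """
    if len(wnid) != 9 or wnid[0] not in "nvars" or not wnid[1:].isdigit():
        raise ValueError(f"unexpected WordNet ID: {wnid!r}")
    pos, offset = wnid[0], wnid[1:]
    return f"http://wordnet-rdf.princeton.edu/wn30/{offset}-{pos}"


def exact_match_query(wnid: str) -> str:
    """Build a SPARQL query for Wikidata items whose exact match (P2888) is the synset URI.

    The `wdt:` prefix is predefined by the Wikidata Query Service.
    """
    uri = wnid_to_wordnet_uri(wnid)
    return f"SELECT ?item WHERE {{ ?item wdt:P2888 <{uri}> . }}"


if __name__ == "__main__":
    print(exact_match_query("n02084071"))
```

A label for which such a query returns no item would fall into one of the other two groups described above: it either has an exact match candidate that is not yet linked, or it cannot be matched directly.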
Highlights
Thanks to deep learning and convolutional neural networks, the field of computer vision experienced rapid development in recent years
We provide a mapping between ImageNet classes and Wikidata entities, as a first step towards using knowledge graphs as prior knowledge in computer vision tasks
We provide a concise analysis of the number of direct properties, which is a crucial feature for the future usage of the mapping in various computer vision settings
Summary
Thanks to deep learning and convolutional neural networks, the field of computer vision has experienced rapid development in recent years. Our publicly available mapping links the WordNet synsets used as ImageNet labels with Wikidata entities. It can be useful for computer vision tasks that draw on knowledge graphs as a source of prior knowledge. Practical usage scenarios include situations in which labelling data is costly and the considered classes can be linked to a given graph (that is, few- or zero-shot learning tasks). Simpler tasks, such as classification, can use context knowledge stemming from the rich knowledge graph structure (in prototype learning [18], for instance).
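To make the weight initialisation idea concrete, here is a minimal sketch, not the authors' method. It assumes that each mapped Wikidata entity has a precomputed embedding vector (for example from some knowledge graph embedding model). The function name, the toy WordNet IDs, and the placeholder entity identifiers are all hypothetical. Classes with an embedding start from the normalised entity vector; classes without one fall back to a random unit vector.

```python
import math
import random


def init_classifier_weights(wnids, mapping, embeddings, dim, seed=0):
    """Return one unit-length weight vector per class.

    wnids:      class labels (WordNet IDs)
    mapping:    WordNet ID -> Wikidata entity ID (hypothetical toy data here)
    embeddings: entity ID -> list of floats of length `dim` (assumed to exist)
    """
    rng = random.Random(seed)
    weights = {}
    for wnid in wnids:
        vec = embeddings.get(mapping.get(wnid))
        if vec is None:
            # No linked entity or no embedding: random initialisation instead.
            vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        weights[wnid] = [x / norm for x in vec]
    return weights


if __name__ == "__main__":
    toy_mapping = {"n00000001": "Q_PLACEHOLDER_A"}    # placeholder IDs, not real entities
    toy_embeddings = {"Q_PLACEHOLDER_A": [3.0, 4.0]}  # toy 2-dimensional embedding
    w = init_classifier_weights(["n00000001", "n00000002"], toy_mapping, toy_embeddings, dim=2)
    print(w["n00000001"])
```

Normalising the vectors keeps mapped and unmapped classes on the same scale, which is one simple design choice among several possible ones.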