A survey on visual transfer learning using knowledge graphs

Sebastian Monka, Lavdim Halilaj, Achim Rettinger

doi:10.3233/sw-212959

Abstract

The information perceived via visual observations of real-world phenomena is unstructured and complex. Computer vision (CV) is the field of research that attempts to make use of that information. Recent CV approaches utilize deep learning (DL) methods, as they perform well when training and testing domains follow the same underlying data distribution. However, it has been shown that minor variations in the images that occur when these methods are used in the real world can lead to unpredictable and catastrophic errors. Transfer learning is the area of machine learning that tries to prevent these errors. In particular, approaches that augment image data using auxiliary knowledge encoded in language embeddings or knowledge graphs (KGs) have achieved promising results in recent years. This survey focuses on visual transfer learning approaches using KGs, as we believe that KGs are well suited to store and represent any kind of auxiliary knowledge. KGs can represent auxiliary knowledge either in an underlying graph-structured schema or in a vector-based knowledge graph embedding. To enable the reader to solve visual transfer learning problems with the help of specific KG-DL configurations, we start with a description of relevant KG modeling structures of varying expressiveness, such as directed labeled graphs, hypergraphs, and hyper-relational graphs. We explain the notion of a feature extractor, referring specifically to visual and semantic features. We provide a broad overview of knowledge graph embedding methods and describe several joint training objectives suitable for combining them with high-dimensional visual embeddings. The main section introduces four categories of how a KG can be combined with a DL pipeline: 1) Knowledge Graph as a Reviewer; 2) Knowledge Graph as a Trainee; 3) Knowledge Graph as a Trainer; and 4) Knowledge Graph as a Peer. To help researchers find meaningful evaluation benchmarks, we provide an overview of generic KGs and a set of image processing datasets and benchmarks that include various types of auxiliary knowledge. Finally, we summarize related surveys and give an outlook on challenges and open issues for future research.

Highlights

Deep learning (DL) as a machine learning (ML) technique is broadly used to successfully solve computer vision (CV) tasks
A common method for training a deep neural network (DNN) is to minimize the cross-entropy (CE) loss between the empirical distribution of the training set and the probability distribution defined by the model, which is equivalent to minimizing the negative log-likelihood (i.e., maximizing the likelihood) of the training data
Methods that belong to the category Knowledge Graph as a Trainer combine the visual output of a DNN with the auxiliary knowledge of a knowledge graph (KG) by learning a visual-semantic embedding h_{v,s}

Summary

Introduction

Deep learning (DL) as a machine learning (ML) technique is broadly used to successfully solve computer vision (CV) tasks. A common method for training a deep neural network (DNN) is to minimize the cross-entropy (CE) loss between the empirical distribution of the training set and the probability distribution defined by the model, which is equivalent to minimizing the negative log-likelihood (i.e., maximizing the likelihood) of the training data. This relies on the independent and identically distributed (i.i.d.) assumptions underlying basic ML, which state that the examples in each dataset are independent of each other and that the train and test sets are identically distributed, i.e., drawn from the same probability distribution [47]. If the train and test domains follow different image distributions, the i.i.d. assumptions are violated, and DL leads to unpredictable and poor results [131]. Zero-shot learning is a visual transfer learning task with labeled source domain data and unlabeled target domain data. If zero-shot learning has access to an additional set of labeled target data X_T, the task is called few-shot learning.
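The relationship between the CE loss and the negative log-likelihood can be checked numerically. The following is a minimal sketch in plain Python, not code from the survey: the logits, labels, and function names are made-up illustrations, and the only assumption is a softmax classifier with one-hot targets.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(one_hot, probs):
    """CE between a one-hot empirical distribution and the model distribution."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)

def negative_log_likelihood(label, probs):
    """Negative log of the probability the model assigns to the true class."""
    return -math.log(probs[label])

# Hypothetical model outputs for two training images and their true classes.
logits_batch = [[2.0, 0.5, -1.0], [0.1, 1.2, 0.3]]
labels = [0, 1]

ce_values = []
nll_values = []
for logits, label in zip(logits_batch, labels):
    probs = softmax(logits)
    one_hot = [1.0 if k == label else 0.0 for k in range(len(logits))]
    ce_values.append(cross_entropy(one_hot, probs))
    nll_values.append(negative_log_likelihood(label, probs))

# Averaged over the batch, both quantities coincide, so minimizing one
# minimizes the other (and maximizes the likelihood of the labels).
mean_ce = sum(ce_values) / len(ce_values)
mean_nll = sum(nll_values) / len(nll_values)
print(mean_ce, mean_nll)
```

With one-hot targets, every term of the CE sum vanishes except the one for the true class, which is why the two averages agree.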

Journal: Semantic Web	Publication Date: Apr 6, 2022
Citations: 10	License type: CC BY 4.0
