Abstract

Knowledge graph embedding models have become a popular approach to knowledge graph completion: they predict the plausibility of (potential) triples by mapping the entities and relations of the knowledge graph into an embedding space. However, knowledge graphs often contain additional textual information stored in literals, which such embedding models ignore. As a consequence, the learning process remains limited to the structure and connections between entities, which can negatively affect performance. We bridge this gap by leveraging the capabilities of pre-trained language models to include textual knowledge in the learning process of embedding models. This is achieved by introducing a new loss function that guides embedding models in measuring the likelihood of triples while taking such complementary knowledge into consideration. The proposed solution is a model-independent loss function that can be plugged into any knowledge graph embedding model. In this paper, Sentence-BERT and fastText are used as the pre-trained language models from which the embeddings of the textual knowledge are obtained and injected into the loss function. The loss function contains a trainable slack variable that determines the degree to which the language models influence the plausibility of triples. Our experimental evaluation on six benchmarks, namely Nations, UMLS, WordNet, and three versions of CoDEx, confirms the advantage of using pre-trained language models for boosting the accuracy of knowledge graph embedding models. We showcase this by evaluating five well-known knowledge graph embedding models: TransE, RotatE, ComplEx, DistMult, and QuatE. The results show improvements in accuracy of up to 9% on the UMLS dataset for the DistMult model and 4.2% on the Nations dataset for the ComplEx model when they are guided by pre-trained language models. We additionally study the effect of factors such as the structure of the knowledge graphs and the number of training steps, and present the findings as ablation studies.
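To make the core idea more concrete, the following is a minimal PyTorch sketch of how an LM-guided loss of this kind might look. It is an illustration under our own assumptions, not the paper's actual formulation: the class name LMGuidedMarginLoss, the sigmoid gating of the slack variable, the pairwise margin form of the loss, and the head-plus-relation composition of Sentence-BERT embeddings are all hypothetical choices made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LMGuidedMarginLoss(nn.Module):
    """Pairwise margin loss over blended triple scores: the structural score
    from any KGE model (TransE, RotatE, ...) plus a language-model similarity
    signal, weighted by a trainable slack variable.

    Hypothetical sketch of the paper's idea, not its exact formulation.
    """

    def __init__(self, margin: float = 1.0):
        super().__init__()
        self.margin = margin
        # Trainable slack variable; the sigmoid below keeps its effective
        # weight in (0, 1), so the model learns how much to trust the LM.
        self.slack = nn.Parameter(torch.tensor(0.0))

    def forward(self, pos_score, neg_score, pos_lm_sim, neg_lm_sim):
        # pos_score / neg_score: structural scores from the KGE model for
        # positive and (sampled) negative triples.
        # pos_lm_sim / neg_lm_sim: LM-derived similarity for the same triples.
        alpha = torch.sigmoid(self.slack)
        pos = pos_score + alpha * pos_lm_sim
        neg = neg_score + alpha * neg_lm_sim
        # Standard margin ranking over the blended scores.
        return F.relu(self.margin + neg - pos).mean()


# One plausible way to obtain the LM similarity signal with Sentence-BERT
# (sentence-transformers package; the model name is an arbitrary choice):
from sentence_transformers import SentenceTransformer

sbert = SentenceTransformer("all-MiniLM-L6-v2")
texts = ["aspirin", "treats", "headache"]               # head, relation, tail labels
emb = sbert.encode(texts, convert_to_tensor=True)        # shape (3, dim)
head, rel, tail = emb[0], emb[1], emb[2]
lm_sim = F.cosine_similarity(head + rel, tail, dim=0)    # scalar in [-1, 1]
```

Because the blending weight is a single learnable parameter, the degree of LM influence is fitted per dataset during training rather than hand-tuned, which is consistent with the role the abstract describes for the slack variable.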
