Abstract
Embedding learning on knowledge graphs (KGs) aims to encode all entities and relationships into a continuous vector space, which provides an effective and flexible method for implementing downstream knowledge-driven artificial intelligence (AI) and natural language processing (NLP) tasks. Since KG construction usually relies on automatic mechanisms with little human supervision, it inevitably introduces considerable noise into KGs. However, most conventional KG embedding approaches inappropriately assume that all facts in existing KGs are completely correct and ignore the noise issue, which can lead to potentially serious errors. To address this issue, in this paper we propose a novel approach to learning embeddings with triple trustiness on KGs, which takes possible noise into consideration. Specifically, we calculate the trustiness value of triples according to the rich and relatively reliable information from large amounts of entity type instances and entity descriptions in KGs. In addition, we present a cross-entropy-based loss function for model optimization. In experiments, we evaluate our models on KG noise detection, KG completion, and triple classification. Through extensive experiments on three datasets, we demonstrate that our proposed model learns better embeddings than all baselines on noisy KGs.
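The abstract does not spell out the exact formulation, but the core idea of weighting a cross-entropy objective by per-triple trustiness can be illustrated with a minimal sketch. Everything below (the function names, the sigmoid link, and the use of trustiness as a simple multiplicative weight) is an illustrative assumption, not the paper's exact loss.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def trustiness_weighted_bce(scores, labels, trustiness):
    # scores:     plausibility scores f(h, r, t) for a batch of triples
    # labels:     1.0 for observed (positive) triples, 0.0 for sampled negatives
    # trustiness: per-triple trustiness in [0, 1]; hypothetical weighting scheme
    p = sigmoid(scores)
    eps = 1e-12  # numerical stability for log
    bce = -(labels * np.log(p + eps) + (1.0 - labels) * np.log(1.0 - p + eps))
    # Down-weight likely-noisy triples so they contribute less to the loss.
    return np.mean(trustiness * bce)

# Toy usage: three triples, the second one suspected to be noisy.
scores = np.array([2.1, 0.3, -1.5])
labels = np.array([1.0, 1.0, 0.0])
trust  = np.array([0.95, 0.40, 1.00])
print(trustiness_weighted_bce(scores, labels, trust))

Under this reading, a triple with low trustiness still participates in training, but its gradient is scaled down, which is one plausible way to make embedding learning robust to noisy facts.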
Highlights
Knowledge graphs (KGs) provide effective, well-structured relational information between entities. A typical KG usually consists of a huge number of knowledge triples in the form of (head entity, relationship, tail entity), denoted (h, r, t), e.g., (Barack Obama, was_born_in, Hawaii)
We propose two sub-models for calculating triple trustiness: one is estimated on newly generated entity type triples and the other is measured with synthetic entity description triples (see the sketch after this list)
Our methods achieve more significant improvements as the noise rate increases, compared with the basic translation-based method (TransE) across the three noisy datasets. This verifies that considering trustiness in noisy KG embedding is essential, especially when KGs have a high rate of noise
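As a rough illustration of the two sub-models in the highlights, the sketch below combines a type-based trustiness score and a description-based trustiness score into a single triple trustiness value. The convex-mixture rule and the weight alpha are hypothetical placeholders; the paper's actual combination formula is not given here.

def triple_trustiness(type_score, desc_score, alpha=0.5):
    # type_score: trustiness estimated on newly generated entity type triples
    # desc_score: trustiness measured with synthetic entity description triples
    # alpha:      hypothetical mixing weight in [0, 1]
    return alpha * type_score + (1.0 - alpha) * desc_score

print(triple_trustiness(0.9, 0.7))  # -> 0.8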
Summary
Knowledge graphs (KGs) provide effective, well-structured relational information between entities. A typical KG usually consists of a huge number of knowledge triples in the form of (head entity, relationship, tail entity), denoted (h, r, t), e.g., (Barack Obama, was_born_in, Hawaii). KG embedding aims at learning embeddings of all entities and relationships, which are usually used to promote downstream knowledge-driven artificial intelligence (AI) and natural language processing (NLP) tasks, such as human-like reasoning, semantic parsing [1], question answering [2,3], relation extraction [4,5], speech generation [6], etc. The past decade has witnessed a great surge in building web-scale KGs, such as Freebase [7], WordNet [8], YAGO [9], DBpedia [10], Google Knowledge Graph [11], and other domain-specific KGs. Recently, open information extraction (Open IE) [12], automatic neural relation extraction [13], and crowd-sourcing mechanisms have been widely used for KG construction, while these approaches inevitably introduce noise into KGs due to insufficient human supervision [14,15].
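The (h, r, t) notation above maps directly onto the translation assumption of TransE, the base model the highlights compare against: a plausible triple should satisfy h + r ≈ t in the embedding space, so the dissimilarity ||h + r - t|| scores triples (lower is more plausible). The toy sketch below uses random embeddings and the example entities from the text purely for illustration; it shows the standard TransE scoring function, not the paper's full trustiness-aware model.

import numpy as np

rng = np.random.default_rng(0)
dim = 50
# Toy embedding tables for entities and relations (random, for illustration only).
entity_emb   = {name: rng.normal(size=dim) for name in ["Barack Obama", "Hawaii"]}
relation_emb = {name: rng.normal(size=dim) for name in ["was_born_in"]}

def transe_score(h, r, t, norm=1):
    # Dissimilarity ||h + r - t|| under the L1 (or L2) norm; lower means
    # the translated head embedding lands closer to the tail embedding.
    return np.linalg.norm(entity_emb[h] + relation_emb[r] - entity_emb[t], ord=norm)

print(transe_score("Barack Obama", "was_born_in", "Hawaii"))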