Abstract
Knowledge graphs (KGs) are collections of real-world knowledge that is represented by a structured form of triples. Since they are manually built in their nascent stage, there is a common problem that some links (triples) are missing. Knowledge graph completion (KGC) aims to find those missing links and thereby complete the KGs. However, as knowledge increases through diverse sources, new entities have explosively emerged and they are needed to be connected to existing KGs. Thus, open-world KGC is targeted on extending KGs to those new entities. Dealing with those new entities is challenging because they do not have any connection with entities in the existing KGs. One way to handle the new ones is to embed them with their textual descriptions with pre-trained word embeddings and score them in the graph-vector space with the existing typical KGC models. These models have resulted in meaningful results but there is still a lack of studies on utilizing the latest neural networks, such as pre-trained language models which are known to be better at capturing contexts than pre-trained word embeddings. This paper proposes a novel model that effectively connects new entities and existing KGs through a pre-trained language model. To effectively handle the problem, we utilize two learning methods; one is the classification method of the masked language model (MLM) that predicts a word among a huge vocabulary set with a given context, and the other is multi-task learning based on the Multi-Task for Deep Neural Networks (MT-DNN). Based on the methods, the model first generates an embedding of a new entity using its textual description and then uses the embedding to find one of the existing entities from a KG where the new entity can be connected. The experimental results on three benchmark datasets, DBPedia50k, FB15k-237-OWE, and FB20k, show that the proposed model improves performances by 9.2%p, 4.4%p, and 11.1%p, respectively, and achieves new state-of-the-art performance for all datasets.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.