Abstract

In this paper we present a cross-lingual extension of a neural tensor network model for knowledge base completion. We exploit multilingual synsets from BabelNet to translate English triples to other languages and then augment the reference knowledge base with cross-lingual triples. We project monolingual embeddings of different languages to a shared multilingual space and use them for network initialization (i.e., as initial concept embeddings). We then train the network with triples from the cross-lingually augmented knowledge base. Results on WordNet link prediction show that leveraging cross-lingual information yields significant gains over exploiting only monolingual triples.
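
To make the pipeline concrete, here is a minimal Python sketch of the triple-augmentation step, assuming BabelNet-style lookups synset_of (English lexicalization → synset id) and lexicalizations ((synset id, language) → surface forms); these names are hypothetical and are not the authors' code. The abstract does not specify how monolingual embeddings are projected into the shared multilingual space, so the sketch also includes a common stand-in, a least-squares translation matrix fit on a seed dictionary (Mikolov-style); the paper may use a different projection method.

```python
import numpy as np

def augment_triples(triples, synset_of, lexicalizations, languages):
    """Add cross-lingual copies of each English triple (e1, r, e2):
    every pair of lexicalizations of the same two synsets in another
    language yields an additional triple with the same relation."""
    augmented = list(triples)
    for e1, r, e2 in triples:
        for lang in languages:
            for f1 in lexicalizations.get((synset_of[e1], lang), []):
                for f2 in lexicalizations.get((synset_of[e2], lang), []):
                    augmented.append((f1, r, f2))
    return augmented

def projection_matrix(X_src, X_en):
    """Least-squares linear map from a source-language embedding space
    into the English space, fit on a seed dictionary of aligned rows
    (one plausible choice of projection, shown only for illustration).
    X_src: (n, d_src), X_en: (n, d_en); apply as X_src_new @ W."""
    W, *_ = np.linalg.lstsq(X_src, X_en, rcond=None)
    return W
```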

Highlights

  • In recent years we have witnessed an impressive amount of work on the automatic construction of wide-coverage Knowledge Bases (KBs), ranging from Web-scale machine reading systems like NELL (Carlson et al., 2010) to large-scale ontologies like DBpedia (Bizer et al., 2009), YAGO (Hoffart et al., 2013), and BabelNet (Navigli and Ponzetto, 2012b), a multilingual KB covering a wide range of languages

  • We present a cross-lingual extension of the NTNKBC model of Socher et al. (2013) that leverages a multilingual knowledge graph and a multilingual embedding space

  • Our results indicate that using cross-lingual links between entity lexicalizations in different languages yields a better NTNKBC model


Summary

Introduction

In recent years we have witnessed an impressive amount of work on the automatic construction of wide-coverage Knowledge Bases (KBs), ranging from Web-scale machine reading systems like NELL (Carlson et al., 2010) to large-scale ontologies like DBpedia (Bizer et al., 2009), YAGO (Hoffart et al., 2013), and BabelNet (Navigli and Ponzetto, 2012b), a multilingual KB covering a wide range of languages. Neural models have recently been applied ubiquitously across NLP tasks, and knowledge base completion (KBC) is no exception (Bordes et al., 2011; Jenatton et al., 2012; Bordes et al., 2013; Socher et al., 2013; Wang et al., 2014; Yang et al., 2015). These models represent KB concepts and relations as vectors, matrices, or, in the most expressive models such as that of Socher et al. (2013), as three-dimensional tensors. We believe that a shared multilingual embedding space and cross-lingual knowledge links provide a form of additional regularization for the neural tensor network model and allow for better generalization, yielding significant link prediction improvements.
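
For concreteness, the scoring function of the underlying Neural Tensor Network (Socher et al., 2013) can be sketched in a few lines of NumPy; the formula follows the original paper, while the variable names are ours. A triple (e1, r, e2) is scored by a relation-specific bilinear tensor term plus a standard feed-forward term:

```python
import numpy as np

def ntn_score(e1, e2, W, V, b, u):
    """Neural Tensor Network score for a triple (e1, r, e2), following
    Socher et al. (2013): u_r^T tanh(e1^T W_r^[1:k] e2 + V_r [e1; e2] + b_r).
    Shapes: e1, e2 (d,); W (k, d, d); V (k, 2d); b (k,); u (k,)."""
    bilinear = np.einsum('d,kde,e->k', e1, W, e2)  # e1^T W_r^[1:k] e2, one value per slice
    linear = V @ np.concatenate([e1, e2])          # standard feed-forward term V_r [e1; e2]
    return float(u @ np.tanh(bilinear + linear + b))
```

In the cross-lingual setting studied here, the entity vectors e1 and e2 are initialized from the projected multilingual embeddings rather than from monolingual ones.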

