Abstract

Several systems that learn to populate and extend a knowledge base (KB) from the web in different languages have recently been presented. Although a large set of concepts should be learnable regardless of the language being read, some facts (e.g., about culture or geography) are expected to be gathered more easily in the local language. A system that merges KBs learnt in different languages will benefit from their complementary information, provided that common beliefs are identified, as well as from the redundancy present in web pages written in different languages. In this paper, we address the problem of identifying equivalent beliefs (or concepts) across language-specific KBs, assuming that they share the same ontology of categories and relations. In a case study with two KBs learnt independently from web pages written in English and in Portuguese, respectively, we report the results of two methodologies: an approach based on personalized PageRank and an inference technique that finds common relevant paths through the KBs. The proposed inference technique efficiently identifies relevant paths, outperforming the baseline (a dictionary-based classifier) in the vast majority of tested categories.
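For readers unfamiliar with personalized PageRank, the following is a minimal, generic sketch of the idea, not the variant used in the paper (whose details are not given in this excerpt). It assumes the KB is viewed as an undirected graph of entities, that all teleport probability returns to a single seed entity, and that a damping factor of 0.85 is used; the function name, the toy entities, and the parameter values are illustrative assumptions.

```python
def personalized_pagerank(edges, seed, damping=0.85, iterations=50):
    """Generic power-iteration personalized PageRank (illustrative only).

    edges: iterable of (entity, entity) pairs, treated as undirected.
    seed:  entity that receives all of the teleport probability.
    """
    # Build an adjacency map; every node listed here has >= 1 neighbor.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    if seed not in neighbors:
        raise KeyError(seed)

    # Start with all probability mass on the seed entity.
    scores = {node: 0.0 for node in neighbors}
    scores[seed] = 1.0

    for _ in range(iterations):
        updated = {node: 0.0 for node in neighbors}
        # Each node passes a damped, equal share of its score to its neighbors.
        for node, score in scores.items():
            share = damping * score / len(neighbors[node])
            for other in neighbors[node]:
                updated[other] += share
        # The remaining mass teleports back to the seed.
        updated[seed] += 1.0 - damping
        scores = updated
    return scores


# Hypothetical toy KB: the entities and links are made up for illustration.
toy_edges = [
    ("Lisbon", "Portugal"),
    ("Porto", "Portugal"),
    ("Portugal", "Europe"),
    ("Brazil", "South America"),
]
scores = personalized_pagerank(toy_edges, seed="Lisbon")
for entity, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{entity}: {score:.3f}")
```

In this sketch, entities that cannot be reached from the seed keep a score of zero, so the ranking reflects proximity to the seed rather than global importance.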

Highlights

  • In the last few decades, the machine learning community has launched several research projects to take advantage of the massive source of information that the web has become, and of the people who build it.

  • In this paper, we deal with the problem of merging knowledge bases (KBs) learnt in different languages.

  • We found that many entities in both KBs are isolated, i.e., they have no relationships.


Summary

Introduction

In the last few decades, the machine learning community has launched several research projects to take advantage of the massive source of information that the web has become, and of the people who build it. Information extraction systems (IES) that use the text found in web pages to extract, validate and incorporate beliefs into a structured knowledge base have been developed (e.g., YAGO (Suchanek et al., 2008), NELL (Mitchell et al., 2015) or Knowledge Vault (Dong et al., 2014)). Such knowledge bases (KBs) store facts about the real world, which are represented as entities and relationships among entities.

