Abstract

Knowledge bases (KBs) are often highly incomplete, which creates a need for KB completion. Although XLORE is an English-Chinese bilingual knowledge graph, it contains only 423,974 cross-lingual links between English and Chinese instances. We present XLORE2, an extension of XLORE that is built automatically from Wikipedia, Baidu Baike and Hudong Baike. We add more facts through cross-lingual knowledge linking, cross-lingual property matching and fine-grained type inference. We also design an entity linking system to demonstrate the effectiveness and broad coverage of XLORE2.

Highlights

  • Wikipedia has become one of the most accessible online encyclopedias

  • XLORE2 reveals significantly more facts when compared with XLORE

  • In XLORE, more than 18.7% of instances lack useful type information, so our goal is to identify the semantic type of an instance in XLORE2

Summary

Introduction

Wikipedia has become one of the most accessible online encyclopedias. It has extremely high language coverage, containing articles in 298 languages. The English edition is the largest, with more than 5.6 million articles. Its “everyone can edit” model means its knowledge constantly grows and evolves. Knowledge in Wikipedia takes the form of free text or attribute-value pairs in infoboxes. Wikipedia’s vast knowledge has inspired many knowledge base (KB) projects that structure knowledge and link it across languages.

