Abstract

The number of documents published on the Web in languages other than English grows every year. As a consequence, the need to extract useful information from texts in different languages increases, highlighting the importance of research into Open Information Extraction (OIE) techniques. Most OIE methods deal with features of a single language; few approaches tackle multilingual aspects, and in those approaches multilingualism is restricted to processing text in different languages rather than exploiting cross-linguistic resources, which results in low precision due to the use of general rules. Multilingual methods have been applied to numerous problems in Natural Language Processing, achieving satisfactory results and demonstrating that knowledge acquired for one language can be transferred to other languages to improve the quality of the extracted facts. We argue that a multilingual approach can enhance OIE methods, since it allows OIE systems to be evaluated and compared and can be applied to the collected facts. In this work, we discuss how transferring knowledge between languages can improve fact acquisition in multilingual approaches. We provide a roadmap of the Multilingual Open IE area based on state-of-the-art studies. Additionally, we evaluate the transfer of knowledge to improve the quality of the facts extracted in each language. Moreover, we discuss the importance of a parallel corpus for evaluating and comparing multilingual systems.

Highlights

  • Textual data are the main form of data published on the Web, and the number of published documents increases daily

  • We investigated approaches to Multilingual Open Information Extraction

  • We presented a systematic mapping study to analyze the Multilingual Open Information Extraction area and performed initial experiments on the use of multilingual resources to improve the performance of Open Information Extraction (OIE) systems


Summary

Introduction

Textual data are the main form of data published on the Web, and the number of published documents increases daily. As much as the Web is a valuable source of information and knowledge, the sheer amount of available pages renders it impossible for a person to explore all of the available information on any subject. It is therefore of great importance to have methods for extracting useful information from texts. Information Extraction (IE), also called Text Analysis, studies computational methods for identifying structured semantic information in unstructured sources such as documents or web pages. IE methods usually aim to identify semantic information expressed in natural language, such as discourse entities and their relations, and to store it in a standard, computation-friendly representation, such as relational tuples, for further use.
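To make the notion of a relational tuple concrete, the sketch below maps simple English sentences to (argument1, relation, argument2) tuples. It is a minimal illustration under our own assumptions: the Triple class, the hand-written verb pattern, and the example sentences are hypothetical and are not taken from any surveyed system, which typically rely on syntactic parsing or learned models rather than a single rule.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Triple:
    """A relational tuple of the form (argument1, relation, argument2)."""
    arg1: str
    relation: str
    arg2: str


def extract_triples(sentence: str) -> List[Triple]:
    """Extract tuples with a toy verb-centered pattern (illustrative only)."""
    pattern = re.compile(
        r"^(?P<arg1>[A-Z][\w ]+?)\s+"                        # first argument
        r"(?P<rel>(?:is|was|are|were|founded|acquired|created|wrote)"
        r"(?: [a-z]+)?)\s+"                                   # relation phrase
        r"(?P<arg2>.+?)\.?$"                                  # second argument
    )
    match = pattern.match(sentence.strip())
    if not match:
        return []
    return [Triple(match["arg1"], match["rel"], match["arg2"])]


if __name__ == "__main__":
    for s in ["Marie Curie was born in Warsaw.",
              "Google acquired YouTube in 2006."]:
        print(extract_triples(s))
```

Running the sketch yields tuples such as (Marie Curie, was born, in Warsaw); an actual OIE system would produce comparable tuples for open-ended relations across arbitrary text.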

