CrossOIE: Cross-Lingual Classifier for Open Information Extraction

Bruno Souza Cabral, Marlo Souza, Daniela Barreiro Claro, Rafael Glauber

doi:10.1007/978-3-030-41505-1_35

Abstract

Open information extraction (Open IE) is the task of extracting open-domain assertions from natural language sentences. Given the low availability of datasets and tools for this task in languages other than English, it has recently been proposed that multilingual resources can be used to improve Open IE methods for different languages. In this work, we present CrossOIE, a publicly available multilingual relation tuple validity classifier that scores Open IE systems’ extractions based on their estimated quality. It can be used to improve Open IE systems and to assist in the creation of Open IE benchmarks for different languages. Experiments show that our model, trained on a small corpus in English, Spanish, and Portuguese, can trade recall for up to a 27% improvement in precision. This result was also achieved in a zero-shot scenario, demonstrating successful knowledge transfer across languages.

Full Text