An adaptable, high-performance relation extraction system for complex sentences

Anu Thomas, Sangeetha Sivanesan

doi:10.1016/j.knosys.2022.108956

Abstract

The rapid proliferation of text data has led to an increase in the use of Information Extraction (IE) techniques to automatically extract key information in a fast and effective manner. Relation Extraction (RE), a sub-task of IE, focuses on extracting semantic relations from free natural language text and is crucial for further applications including Question Answering, Information Retrieval, Knowledge Base construction, and Text Summarization. The literature shows that supervised learning approaches have been widely used in RE. However, the performance of supervised methodologies depends on the availability of domain-specific annotated datasets, which is not viable for many domains, including legal, financial, and insurance. In recent times, Open Information Extraction (OIE) techniques have addressed this issue by facilitating domain-independent extraction of relations from large text corpora, with no demand for domain-specific tagged data or predefined relation classes. Even though OIE systems are fast and simple to implement, they are less effective in handling complex sentences and often produce redundant extractions. This paper proposes an efficient RE system to extract domain-specific relations from natural language text, consisting of Knowledge-based and Semi-supervised learning systems, integrated with a domain ontology. We evaluated the performance of the proposed work on the ‘judicial domain’ as a use case and found that it overcomes the flaws and limitations of existing RE approaches, achieving better results in terms of precision and recall. On further analysis, we found that the proposed system outperforms existing cutting-edge OIE systems on varying sentence length and complexity.

Full Text