Abstract
As an essential component of human cognition, cause–effect relations appear frequently in text, and curating them helps in building causal networks for predictive tasks. Existing causality extraction techniques include knowledge-based, statistical machine learning (ML)-based, and deep learning-based approaches. Each approach has its advantages and weaknesses. For example, knowledge-based methods are understandable but require extensive manual domain knowledge and have poor cross-domain applicability. Statistical machine learning methods are more automated because they build on natural language processing (NLP) toolkits. However, feature engineering is labor-intensive, and toolkits may lead to error propagation. In the past few years, deep learning techniques have attracted substantial attention from NLP researchers because of their powerful representation learning ability and the rapid increase in computational resources. Their limitations include high computational costs and a lack of adequate annotated training data. In this paper, we conduct a comprehensive survey of causality extraction. We first introduce the primary forms of causality found in text: explicit intra-sentential causality, implicit causality, and inter-sentential causality. Next, we list benchmark datasets and evaluation methods for causal relation extraction. Then, we present a structured overview of the three families of techniques with their representative systems. Lastly, we highlight open challenges and potential directions for addressing them.
Highlights
With the rapid growth of unstructured texts online, information extraction (IE) plays a vital role in natural language processing (NLP) research.
Based on the assumption that dependency paths between cause and effect can be viewed as background knowledge, they use a wide range of such paths, regardless of whether cause and effect appear within one sentence or in adjacent sentences, taking web texts as extra input.
Causal relations in natural language text play a key role in clinical decision-making, biomedical knowledge discovery, emergency management, news topic references, and other areas.
Summary
With the rapid growth of unstructured texts online, information extraction (IE) plays a vital role in NLP research. Relation extraction (RE) refers to extracting and classifying semantic relationships, such as whole–part, product–producer, and cause–effect, from text. Whether a disease is the reason for a symptom depends on whether a cause–effect relation holds between them. Extracting such causal relations from the medical literature can support the construction of a knowledge graph, which can help doctors quickly find causal links such as diseases-cause-symptoms, diseases-bring-complications, and treatments-improve-conditions, and customize treatment plans. The task of causality extraction (CE) focuses on developing systems for identifying cause–effect relations between pairs of labeled nouns in text [5]. CE studies can be classified by representation pattern: explicit or implicit causality, and intra- or inter-sentential causality. In many texts, causality is implicit and/or inter-sentential, and these cases are more complicated than the basic explicit, intra-sentential kind.
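To make the explicit intra-sentential case concrete, the following is a minimal sketch of a knowledge-based, pattern-matching extractor. It is an illustration only and not a system described in the survey: the cue phrases, the pattern `CUE_PATTERN`, and the function `extract_explicit_causality` are hypothetical names chosen for this example.

```python
import re

# Hypothetical cue phrases for illustration; a real knowledge-based system
# would rely on a much larger, manually curated set of patterns.
CUE_PATTERN = re.compile(
    r"^(?P<cause>.+?)\s+(?P<cue>causes|leads to|results in)\s+(?P<effect>.+?)\.?$",
    re.IGNORECASE,
)


def extract_explicit_causality(sentence):
    """Return a (cause, effect) pair if the sentence contains an explicit cue.

    Returns None when no cue phrase matches, which is the case for
    implicit causality and for causality that spans several sentences.
    """
    match = CUE_PATTERN.match(sentence.strip())
    if match is None:
        return None
    return match.group("cause"), match.group("effect")


if __name__ == "__main__":
    print(extract_explicit_causality("Smoking causes lung cancer."))
    print(extract_explicit_causality("He was fired after the scandal."))
```

The second sentence carries no explicit cue phrase, so the sketch returns nothing for it. This reflects the limitation noted above: hand-written patterns handle explicit causality within a sentence but miss implicit and inter-sentential cases.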