Abstract

The era of big textual corpora and machine learning technologies has paved the way for researchers in numerous data mining fields. Among them, causality mining (CM) from textual data has become a significant area of concern and has attracted more attention from researchers. Causality (cause-effect relations) is an essential category of relationships that plays a significant role in question answering, future event prediction, discourse comprehension, decision making, future scenario generation, medical text mining, behavior prediction, and textual prediction entailment. Despite decades of development, CM techniques still leave room for performance improvement, especially for ambiguous and implicitly expressed causalities. The ineffectiveness of the early attempts is mainly due to small, ambiguous, heterogeneous, and domain-specific datasets constructed with manually crafted linguistic and syntactic rules. Many researchers have deployed shallow machine learning (ML) and deep learning (DL) techniques to deal with such datasets and have achieved satisfactory performance. In this survey, we comprehensively review some state-of-the-art shallow ML and DL approaches in CM. We present a detailed taxonomy of CM and discuss popular ML and DL approaches with their comparative weaknesses and strengths, applications, popular datasets, and frameworks. Lastly, we discuss future research challenges and illustrate how to transform them into productive research directions.

Highlights

  • Natural Language Processing (NLP), also termed computational linguistics, involves designing computational systems and procedures to handle natural language problems in informative software platforms

  • We have cited over 100 popular papers from the above libraries and have shortlisted about 45 articles on causality mining (CM) that focus on shallow machine learning (ML) only

  • To the best of our knowledge, this is the first survey paper that focuses on widespread state-of-the-art ML and deep learning (DL) research techniques, algorithms, and frameworks for CM spanning a few decades


Summary

Introduction

Natural Language Processing (NLP), also termed computational linguistics, involves designing computational systems and procedures to handle natural language problems in informative software platforms. Application fields focus on mining valuable relational information, including Cause-Effect, Part-Whole, Product-Produce, Content-Container, and If- relations, as well as translation of text among and between languages, sentiment analysis, summarization, automatic question answering, document classification, and clustering. For a few decades, automated knowledge extraction from text has been a challenging task because it deals with the interplay of syntax, semantics, vocabulary, metaphors, sarcasm, and ambiguous constructs such as figurative expressions. In these cases, replicating the human brain’s knowledge is important for understanding written text, which requires developing complicated models using ML and DL approaches. Likewise, a causal relation holds whenever the first event (the Cause) happens and the second event (the Effect) essentially or certainly follows.

