Abstract
The era of big textual corpora and machine learning technologies has paved the way for researchers in numerous data mining fields. Among them, causality mining (CM) from textual data has become a significant area of concern and is attracting growing attention from researchers. Causality (cause-effect relations) is an essential category of relationships that plays a significant role in question answering, future event prediction, discourse comprehension, decision making, future scenario generation, medical text mining, behavior prediction, and textual prediction entailment. Despite decades of development, CM techniques still leave room for performance improvement, especially for ambiguous and implicitly expressed causalities. The ineffectiveness of early attempts is mainly due to small, ambiguous, heterogeneous, and domain-specific datasets constructed with manually crafted linguistic and syntactic rules. Many researchers have deployed shallow machine learning (ML) and deep learning (DL) techniques to deal with such datasets and have achieved satisfactory performance. This survey offers a comprehensive review of some state-of-the-art shallow ML and DL approaches in CM. We present a detailed taxonomy of CM and discuss popular ML and DL approaches with their comparative weaknesses and strengths, applications, popular datasets, and frameworks. Lastly, future research challenges are discussed, with illustrations of how to transform them into productive research directions.
Highlights
Natural Language Processing (NLP), also termed computational linguistics, involves designing computational systems and procedures to handle natural language problems in informative software platforms
We have cited over 100 popular papers from the above libraries and have shortlisted about 45 articles on causality mining (CM) that focus on shallow machine learning (ML) only
To the best of our knowledge, this is the first survey paper that focuses on widespread state-of-the-art ML and deep learning (DL) research techniques, algorithms, and frameworks for CM spanning a few decades
Summary
Natural Language Processing (NLP), also termed computational linguistics, involves designing computational systems and procedures to handle natural language problems in informative software platforms. Application fields focus on mining valuable relational information, including Cause-Effect relations, Part-Whole relations, Product-Produce relations, Content-Container relations, and If- relations, as well as translation of text among and between languages, sentiment analysis, summarization, automatic question answering, document classification, and clustering. For a few decades, automated knowledge extraction from text has been a challenging task because it deals with the interplay of syntax, semantics, vocabulary, metaphors, sarcasm, and ambiguous constructs such as figurative expressions. In these cases, emulating the human brain's understanding of written text is an important task, and it requires developing complex models using ML and DL approaches. The same holds whenever the first event (the Cause) happens and the second event (the Effect) essentially or certainly follows
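To make the contrast between explicit and implicit causality concrete, the following is a minimal sketch of the kind of hand-written, cue-phrase rule that early CM systems relied on. It is not taken from the survey: the function name `extract_causal_pair`, the pattern list, and the example sentences are all hypothetical, and a real system would use a far larger lexicon plus syntactic analysis.

```python
import re

# Hypothetical cue-phrase rules (an illustrative assumption, not the survey's method).
# Named groups record which side of the cue phrase is the cause and which is the effect.
PATTERNS = [
    re.compile(r"^(?P<effect>.+?)\s+because of\s+(?P<cause>.+?)\.?$", re.IGNORECASE),
    re.compile(r"^(?P<effect>.+?)\s+because\s+(?P<cause>.+?)\.?$", re.IGNORECASE),
    re.compile(
        r"^(?P<cause>.+?)\s+(?:leads to|causes|results in)\s+(?P<effect>.+?)\.?$",
        re.IGNORECASE,
    ),
]


def extract_causal_pair(sentence):
    """Return (cause, effect) if an explicit cue phrase is found, else None."""
    text = sentence.strip()
    for pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return match.group("cause"), match.group("effect")
    return None


if __name__ == "__main__":
    examples = [
        "Smoking causes lung cancer.",
        "The flight was delayed because of heavy fog.",
        "He ate spoiled food and got sick.",  # implicit causality: no cue phrase
    ]
    for sentence in examples:
        print(sentence, "->", extract_causal_pair(sentence))
```

The third sentence carries a causal relation but no cue phrase, so the rule returns `None`. This is the gap for ambiguous and implicitly expressed causalities that motivates the ML and DL approaches reviewed in the survey.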