The era of big textual corpora and machine learning technologies have paved the way for researchers in numerous data mining fields. Among them, causality mining (CM) from textual data has become a significant area of concern and has more attention from researchers. Causality (cause-effect relations) serves as an essential category of relationships, which plays a significant role in question answering, future events predication, discourse comprehension, decision making, future scenario generation, medical text mining, behavior prediction, and textual prediction entailment. While, decades of development techniques for CM are still prone to performance enhancement, especially for ambiguous and implicitly expressed causalities. The ineffectiveness of the early attempts is mainly due to small, ambiguous, heterogeneous, and domain-specific datasets constructed by manually linguistic and syntactic rules. Many researchers have deployed shallow machine learning (ML) and deep learning (DL) techniques to deal with such datasets, and they achieved satisfactory performance. In this survey, an effort has been made to address a comprehensive review of some state-of-the-art shallow ML and DL approaches in CM. We present a detailed taxonomy of CM and discuss popular ML and DL approaches with their comparative weaknesses and strengths, applications, popular datasets, and frameworks. Lastly, the future research challenges are discussed with illustrations of how to transform them into productive future research directions.
Read full abstract