Optimal Causal Decision Trees Ensemble for Improved Prediction and Causal Inference

Neelam Younas, Hafsa Hina, Zardad Khan, Saeed Aldahmani, Amjad Ali, Muhammad Hamraz

doi:10.1109/access.2022.3146406

Abstract

Ensemble methods can be used to identify causal relationships in data, supporting better understanding and sound decisions in high-risk processes. This paper explores the idea of a causal decision tree forest and proposes a regularized ensemble method that integrates optimal causal trees to improve prediction accuracy without compromising the accurate estimation of heterogeneous treatment effects. The proposed method selects a subset of the most accurate causal trees from a sufficiently large pool based on their out-of-sample error estimates. The selected trees are integrated to form an ensemble that is used for estimating heterogeneous treatment effects and predicting unseen data. As an example dataset, the proposed method is applied to Pakistan’s income function, consisting of 27964 observations on wages of workers aged 10 and above. The paper also gives a detailed simulation study in which datasets are generated under 5 different designs. The proposed method is assessed against ordinary least squares (OLS), least absolute shrinkage and selection operator (LASSO), Ridge, Causal Tree and the standard decision trees forest (i.e. the causal forest) using mean square error (MSE), root mean square error (RMSE), mean absolute deviation (MAD) and Pearson correlation (r) as performance metrics.
The analyses reveal that the proposed method can be used effectively for estimating heterogeneous treatment effects and achieves better prediction performance than the other methods considered in the paper.
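The selection step described in the abstract (grow a large pool of trees, rank them by out-of-sample error, keep the most accurate subset, then average) can be illustrated with a minimal sketch. This is not the authors' implementation: the one-split regression stump below is only a stand-in for a causal tree, the out-of-sample error is assumed to be out-of-bag MSE on bootstrap samples, and all function names, the toy data and the 20% selection fraction are hypothetical choices made for illustration.

```python
import random
from statistics import mean

def fit_stump(rows, targets):
    """Stand-in for a causal tree (assumption): a one-split stump on a random feature."""
    j = random.randrange(len(rows[0]))
    threshold = mean(r[j] for r in rows)
    left = [t for r, t in zip(rows, targets) if r[j] <= threshold]
    right = [t for r, t in zip(rows, targets) if r[j] > threshold]
    overall = mean(targets)
    left_val = mean(left) if left else overall
    right_val = mean(right) if right else overall
    return lambda r: left_val if r[j] <= threshold else right_val

def grow_pool(rows, targets, n_trees):
    """Grow trees on bootstrap samples; record each tree's out-of-bag MSE."""
    n = len(rows)
    pool = []
    for _ in range(n_trees):
        idx = [random.randrange(n) for _ in range(n)]
        in_bag = set(idx)
        oob = [i for i in range(n) if i not in in_bag]
        if not oob:
            continue  # no held-out rows, so no error estimate for this tree
        tree = fit_stump([rows[i] for i in idx], [targets[i] for i in idx])
        err = mean((targets[i] - tree(rows[i])) ** 2 for i in oob)
        pool.append((err, tree))
    return pool

def select_best(pool, fraction):
    """Keep the given fraction of trees with the smallest out-of-bag error."""
    ranked = sorted(pool, key=lambda pair: pair[0])
    k = max(1, int(len(ranked) * fraction))
    return [tree for _, tree in ranked[:k]]

def ensemble_predict(trees, row):
    """Average the predictions of the selected trees."""
    return mean(tree(row) for tree in trees)

# Toy data (hypothetical): the target depends on the first feature only.
random.seed(0)
rows = [[random.random(), random.random()] for _ in range(200)]
targets = [2.0 * r[0] + random.gauss(0, 0.1) for r in rows]

pool = grow_pool(rows, targets, n_trees=50)
selected = select_best(pool, fraction=0.2)
print(len(selected), ensemble_predict(selected, [0.9, 0.5]))
```

In this toy setting, stumps that split on the uninformative second feature tend to have larger out-of-bag error and are dropped, which is the intuition behind keeping only the most accurate trees.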

Highlights

Identifying the causal relationships in data is key to providing the understanding and knowledge needed for accurate decisions in processes that involve risk.
The proposed optimal causal trees ensemble (OCTE) is assessed using five different simulation scenarios. It is compared with five state-of-the-art methods, i.e., ordinary least squares (OLS), least absolute shrinkage and selection operator (LASSO), Ridge, causal tree and causal random forest.
The OCTE is applied to a real dataset, the nationally representative Labor Force Survey of Pakistan (LFSP).
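The four performance metrics named in the abstract can be computed as in the short sketch below. It assumes that MAD means the mean absolute difference between observed and predicted values, which the page does not spell out; the function names and the example vectors are illustrative only and are not taken from the paper.

```python
import math

def mse(y_true, y_pred):
    """Mean square error between observed and predicted values."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(mse(y_true, y_pred))

def mad(y_true, y_pred):
    """Mean absolute deviation of predictions from observations (assumed definition)."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Illustrative values only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
print(mse(y_true, y_pred), rmse(y_true, y_pred),
      mad(y_true, y_pred), pearson_r(y_true, y_pred))
```

Lower MSE, RMSE and MAD and a higher r indicate better agreement between predictions and observations.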

Summary

Introduction

Identifying the causal relationships in data is key to providing the understanding and knowledge needed for accurate decisions in processes that involve risk. The purpose of using machine learning methods can extend beyond prediction to representing and discovering causal relationships in data and estimating heterogeneous causal effects. This kind of application provides a compact and precise graphical representation of the causal relationships between a set of predictor attributes and an outcome attribute. Typical machine learning methods include classification and regression trees, k-nearest neighbours models, support vector machines, etc.

Journal: IEEE Access
Publication Date: Jan 1, 2022
Citations: 6
License type: CC BY 4.0
