Randomised controlled trials (RCTs) are the gold standard for evaluating health interventions, but they often face ethical and practical challenges. When RCTs are not feasible, large observational data sets become a pivotal resource, although they may be subject to bias and unmeasured confounding. Traditional statistical (non-causal) learning methods, while useful, are limited in their ability to uncover causal effects, i.e., to determine whether an intervention truly has a direct impact on the outcome. Recent advances in causal inference address this gap by building on machine learning-based approaches to investigate not only population-level effects but also the heterogeneous effects of interventions across population subgroups. We demonstrate a causal inference approach that uses causal trees and forests, enhanced by weighting mechanisms that adjust for confounding covariates. The method not only estimates the overall effect of an intervention on the whole population but also shows how that effect differs across subgroups. It also supports the planning and optimisation of interventions by suggesting precise and explainable targeting strategies that maximise overall population health outcomes. These capabilities are valuable for health researchers, offering new insights into existing data and supporting decision-making for future interventions. Using observational data from the 2017-18 Australian National Health Survey, our study applies causal trees to estimate the impact of exercise on BMI, to examine how this impact varies across subgroups, and to assess the effectiveness of various intervention targeting strategies for improving health benefits.