Abstract

In population health surveillance research, survey data are commonly analyzed using regression methods; however, these methods have limited ability to examine complex relationships. In contrast, decision tree models are ideally suited for segmenting populations and examining complex interactions among factors, and their use within health research is growing. This article provides a methodological overview of decision trees and their application to youth mental health survey data. The performance of two popular decision tree techniques, the classification and regression tree (CART) and conditional inference tree (CTREE) techniques, is compared to traditional linear and logistic regression models through an application to youth mental health outcomes in the COMPASS study. Data were collected from 74 501 students across 136 schools in Canada. Anxiety, depression and psychosocial well-being outcomes were measured along with 23 sociodemographic and health behaviour predictors. Model performance was assessed using measures of prediction accuracy, parsimony and relative variable importance. Decision tree and regression models consistently identified the same sets of most important predictors for each outcome, indicating a general level of agreement between methods. Tree models had lower prediction accuracy but were more parsimonious and placed greater relative importance on key differentiating factors. Decision trees provide a means of identifying high-risk subgroups to whom prevention and intervention efforts can be targeted, making them a useful tool to address research questions that cannot be answered by traditional regression methods.
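As a rough illustration of how a tree segments a population, the sketch below searches for the single binary split that most reduces Gini impurity, a criterion commonly used by CART for classification. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not state which splitting criterion or software was used, and the function names, the two predictors and the toy data are hypothetical and are not drawn from the COMPASS study.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_impurity) for the best binary split.

    Every observed value of every feature is tried as a threshold
    (left child: value <= threshold, right child: value > threshold).
    If no split improves on the unsplit node, feature_index is None.
    """
    n = len(rows)
    best = (None, None, gini(labels))
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            if not left or not right:
                continue  # a split must send cases to both children
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, t, score)
    return best


# Hypothetical toy data (NOT COMPASS data):
# [hours_of_sleep, days_active_per_week] -> 1 = elevated symptoms, 0 = not
rows = [[5, 1], [6, 2], [5, 5], [6, 6], [8, 1], [9, 2], [8, 6], [9, 5]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

feature, threshold, impurity = best_split(rows, labels)
print(feature, threshold, impurity)
```

On this toy data the search selects the first predictor with a threshold of 6, which separates the two outcome groups completely. A full tree would apply the same search recursively within each resulting subgroup, which is how high-risk subgroups defined by combinations of predictors emerge.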

