How do statistically informed investigators build decision trees from the results of many dichotomous tests (those with two distinct outcomes, such as normal and abnormal) to help clinicians make the best decisions for their patients?

Recursive partitioning is an especially useful analytical tool in cohort studies conducted to develop or inform clinical pathways or algorithms. This multivariable approach builds decision trees that classify patients as high or low risk for a given condition or outcome using two or more dichotomous predictor (independent) variables.

In comparison with other multivariable methods such as logistic regression, recursive partitioning can be more intuitive for clinicians, does not require users to perform calculations, and helps identify clinical subgroups, variable interactions, and the consequences of false positive and false negative results.

Computer software for recursive partitioning tries various ways of splitting the data to find the “best fit,” that is, the split that misclassifies the fewest true positives and true negatives for the outcome of interest. The user may specify the “cost” of misclassification, for example by allowing the ratio of false negatives to false positives to be no greater than a certain level.

In Mathews et al’s study, discussed in this issue, children with temperatures ≥ 39°C were five times as likely to have pneumonia as those with lower temperatures. The investigators wanted to know which temperature cutoff was most predictive of pneumonia, and whether any other risk factors were useful in decision making. The figure below illustrates their use of recursive partitioning.
Using likelihood ratios, Mathews et al found that only maximal ED temperature was a useful predictor of chest x-ray findings of pneumonia in wheezing children, and that 38°C was the cutoff temperature that best fit their data.

A disadvantage of recursive partitioning is that it does not always work well for continuous variables, because these must be converted into dichotomous variables and thereby lose some of their information. If the risk of an outcome changes in a stepwise fashion as the continuous variable changes (eg, if the risk of pneumonia increased with each degree of increase in temperature), another statistical method, logistic regression, would use the data more efficiently to predict the risk of disease.
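
To make the splitting step concrete, the short Python sketch below shows one way a program might choose a single temperature cutoff. It is an illustration only, not the method or software used by Mathews et al. The temperatures and outcomes are invented for the example, the function names `split_cost` and `best_cutoff` are hypothetical, and the scoring rule (a cost-weighted count of false negatives and false positives) is a simplification; real recursive partitioning packages usually score splits with other measures and then repeat the search within each resulting subgroup.

```python
from typing import Sequence, Tuple


def split_cost(temps: Sequence[float], outcomes: Sequence[bool], cutoff: float,
               fn_cost: float = 1.0, fp_cost: float = 1.0) -> float:
    """Weighted misclassification cost of the rule 'temp >= cutoff means high risk'."""
    # False negative: patient has the outcome but falls below the cutoff.
    false_neg = sum(1 for t, y in zip(temps, outcomes) if y and t < cutoff)
    # False positive: patient lacks the outcome but falls at or above the cutoff.
    false_pos = sum(1 for t, y in zip(temps, outcomes) if not y and t >= cutoff)
    return fn_cost * false_neg + fp_cost * false_pos


def best_cutoff(temps: Sequence[float], outcomes: Sequence[bool],
                candidates: Sequence[float],
                fn_cost: float = 1.0, fp_cost: float = 1.0) -> Tuple[float, float]:
    """Try every candidate cutoff and return (cutoff, cost) for the cheapest split."""
    scored = [(split_cost(temps, outcomes, c, fn_cost, fp_cost), c) for c in candidates]
    cost, cutoff = min(scored)  # ties go to the lower cutoff
    return cutoff, cost


# Invented data for illustration only (NOT from the study): maximal temperature
# in degrees Celsius and whether pneumonia was found.
TEMPS = [37.0, 37.2, 37.5, 37.8, 38.0, 38.1, 38.4, 38.6, 39.0, 39.3]
PNEUMONIA = [False, False, False, False, True, False, True, True, True, True]
CANDIDATES = [37.5, 38.0, 38.5, 39.0]

if __name__ == "__main__":
    # Equal costs for both kinds of error.
    print(best_cutoff(TEMPS, PNEUMONIA, CANDIDATES))
    # Treating a false positive as five times as costly as a false negative.
    print(best_cutoff(TEMPS, PNEUMONIA, CANDIDATES, fp_cost=5.0))
```

With these made-up numbers, equal costs favor a cutoff of 38.0, whereas weighting false positives five times as heavily moves the preferred cutoff up to 38.5. This is the sense in which the user-specified cost of misclassification shapes the tree. The sketch also shows the information loss described above: once the cutoff is chosen, a child at 38.1°C and a child at 39.3°C are treated identically.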