Decision tree (DT) models provide a transparent approach to predicting patients' outcomes within a probabilistic framework. Averaging over DT models can, under certain conditions, deliver reliable estimates of the predictive posterior distribution, which is critically important when predicting an individual patient's outcome. Such estimates can be obtained within the Bayesian framework using Markov chain Monte Carlo (MCMC) and its Reversible Jump extension, which allows DT models to grow to a reasonable size. Existing MCMC strategies, however, have limited ability to control DT structures and tend to sample overgrown models that make unreasonably small data partitions, which degrades uncertainty calibration. This happens because the MCMC sampler explores the DT parameter space with limited knowledge of the distribution of data partitions. We propose a new adaptive strategy that overcomes this limitation, and show that in predicting trauma outcomes the number of data partitions can be significantly reduced, so that unnecessary uncertainty in the estimated predictive posterior density is avoided. The proposed and existing strategies are compared in terms of entropy, which, calculated over the predictive posterior distributions, quantifies the uncertainty in decisions. In this framework, the proposed method outperforms the existing sampling strategies, efficiently avoiding unnecessary uncertainty in decisions.
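To illustrate the evaluation criterion described above, the following sketch computes the Shannon entropy of a model-averaged predictive posterior for a single patient. The per-tree class probabilities are hypothetical placeholders standing in for samples drawn by an MCMC run over DT models; the averaging and entropy steps follow the general Bayesian model-averaging scheme, not the paper's specific sampler.

```python
import numpy as np

# Hypothetical predictive probabilities P(outcome | patient, tree) for one
# patient, one row per DT model sampled from the posterior (e.g. by RJ-MCMC).
# Columns: [P(survival), P(death)] -- illustrative values only.
tree_predictions = np.array([
    [0.80, 0.20],
    [0.70, 0.30],
    [0.90, 0.10],
    [0.75, 0.25],
])

# Model averaging: the predictive posterior is the mean over sampled trees.
posterior = tree_predictions.mean(axis=0)

# Shannon entropy (in bits) of the predictive posterior quantifies the
# uncertainty in the decision; lower entropy means a more confident prediction.
entropy = -np.sum(posterior * np.log2(posterior))
print(posterior, entropy)
```

Comparing sampling strategies then amounts to comparing the entropies of the predictive distributions they produce: a strategy that avoids overgrown trees with unreasonably small partitions yields lower, better-calibrated decision uncertainty.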