Background: A machine learning prognostic mortality scoring system was developed to address challenges in selecting patients for clinical trials in the Intensive Care Unit (ICU). The algorithm uses red blood cell distribution width (RDW) and demographic characteristics to predict ICU mortality, alongside existing ICU mortality scoring systems such as the Simplified Acute Physiology Score (SAPS). Methods: The algorithm, a Mixed-effects logistic Random Forest for binary data (MixRFb), integrates Random Forest (RF) classification with a mixed-effects model for binary outcomes, which accounts for repeated-measurement data. Its performance was compared with that of RF and MixRFb models based solely on SAPS scoring, with an additional evaluation using a descriptive receiver operating characteristic (ROC) curve of RDW's ability to predict mortality. Results: MixRFb incorporating RDW and other covariates outperformed the SAPS-based variant, achieving an area under the curve (AUC) of 0.882 compared to 0.814. Variable importance analysis identified age and RDW as the most important predictors of ICU mortality. Conclusions: The MixRFb algorithm demonstrates superior efficacy in predicting in-hospital mortality and identifies age and RDW as the primary predictors. Implementing this algorithm could facilitate patient selection for clinical trials, thereby improving trial outcomes and strengthening ethical standards. Future research should focus on improving the algorithm's robustness, expanding its applicability across diverse clinical settings and patient demographics, and integrating additional predictive markers to improve patient selection.