Abstract

Background
Risk prediction plays a central role in clinical decision-making for patients undergoing cardiac surgery. The logistic EuroSCORE has shown a dangerous calibration drift as the patient case-mix has changed, resulting in significant overestimation of mortality and risk-averse practice. Despite these limitations, it continues to be used in the United Kingdom because no alternative validated model is available. A better-calibrated prediction model is urgently needed to replace EuroSCORE. Machine learning models are increasingly used for risk prediction in medicine because of their potential to overcome the limitations of regression models. Precisely quantifying the risk of in-hospital mortality may better inform patient-centred decision-making and direct targeted quality improvement interventions.

Methods
This is a retrospective single-centre cohort study using prospectively collected, fully anonymised data from the National Adult Cardiac Surgery Audit database, restricted to patients undergoing adult cardiac surgery at our institute from 1996 to 2017 (n=28,761). The aim was to develop a predictive model with improved discriminatory power and calibration using machine learning methods. Model calibration was assessed using the calibration belt method. The discriminatory power of each model (area under the receiver operating characteristic curve [AUC]) was compared with that of the logistic EuroSCORE using DeLong's test.

Results
A time series of the observed:expected (O:E) ratio for the logistic EuroSCORE showed a linear decrease with a slope of −7.4 × 10⁻³. The calibration belt showed significant risk overestimation across all risk categories (p<0.001). Model discrimination was excellent over time, with a marginal but significant linear downward trend in the AUC (p=0.03). Although miscalibration was detected for all models (p<0.05), the neural network achieved the best calibration, with a test statistic of 13.3, followed by logistic regression (18.0) and EuroSCORE (228.7). The neural network achieved the highest AUC of all models (0.82, 95% CI 0.78–0.85); this was higher than that of the logistic EuroSCORE (0.79, 95% CI 0.75–0.83), although the difference narrowly missed statistical significance (p=0.056).

Conclusion
Our neural network model of in-hospital mortality after cardiac surgery achieves slightly improved discriminatory power and significantly better calibration than EuroSCORE, making it more appropriate for dealing with the changing patient case-mix. Further model training on larger datasets covering broader demographics is necessary. Clinical implementation of such models may reduce the risk of overestimating mortality.

Funding Acknowledgement
Type of funding source: None
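
The Results rest on two quantities: the observed:expected (O:E) mortality ratio tracked over time, and the AUC. The sketch below illustrates how these could be computed from per-patient outcomes and predicted risks. It is not the authors' pipeline: the function names, the yearly grouping, and the toy data are assumptions made for illustration, and the calibration belt and DeLong's test are not reproduced here.

```python
import numpy as np

def oe_ratio(observed, predicted):
    """Observed deaths divided by expected deaths (sum of predicted risks)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return observed.sum() / predicted.sum()

def auc(observed, predicted):
    """AUC as the probability that a death is ranked above a survivor.

    Equivalent to the Mann-Whitney U statistic; ties count as one half.
    """
    observed = np.asarray(observed).astype(bool)
    predicted = np.asarray(predicted, dtype=float)
    pos = predicted[observed]    # predicted risks of patients who died
    neg = predicted[~observed]   # predicted risks of patients who survived
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return wins / (pos.size * neg.size)

def oe_slope(periods, observed, predicted):
    """Least-squares slope of the per-period O:E ratio against the period.

    Grouping by calendar year is an assumption; the abstract does not
    state the time unit of its O:E series.
    """
    periods = np.asarray(periods, dtype=float)
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    unique = np.unique(periods)
    ratios = np.array([
        oe_ratio(observed[periods == p], predicted[periods == p])
        for p in unique
    ])
    # Centre the x-axis so the fit stays well conditioned for values like 2000.
    slope, _intercept = np.polyfit(unique - unique.mean(), ratios, 1)
    return slope

# Toy data (hypothetical): 4 patients per year, each with predicted risk 0.5.
toy_years = [2000] * 4 + [2001] * 4 + [2002] * 4
toy_deaths = [1, 1, 0, 0,  1, 0, 0, 0,  0, 0, 0, 0]
toy_risks = [0.5] * 12

print("O:E slope per year:", oe_slope(toy_years, toy_deaths, toy_risks))
print("AUC:", auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

An O:E ratio below 1 means the model predicts more deaths than occur, so a negative slope corresponds to overestimation that grows over time.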

