Abstract

Central Message

Machine learning is only modestly superior to logistic regression for prediction of cardiac surgery mortality, possibly because of low-dimensional predictors with weak nonlinear relationships.

See Article page 2075.

In 1997, we published the first study comparing coronary artery bypass grafting mortality risk prediction using standard logistic regression versus what was then a state-of-the-art multilayer perceptron neural network, a type of machine learning.[1] The c-indices (receiver operating characteristic curve areas) were nearly identical (0.76) for logistic regression, neural networks, and a committee or ensemble classifier that combined estimates from the other 2 approaches, although the committee classifier had slightly better calibration. We hypothesized that these findings might indicate “absence of complex nonlinear relationships, at least among the variables presented to the network,” the latter caveat emphasizing the limitations in the available predictor variables.

What has happened during the 23 years since our original study? Given the availability of newer machine learning approaches and vast improvements in computer memory and processing speeds, are there now more convincing demonstrations of the superiority of machine learning for cardiac surgery risk prediction?

In the current issue of the Journal, Benedetto and colleagues[2] report a systematic review and meta-analysis of 15 studies selected from more than 459 citations comparing logistic regression with various machine-learning approaches to predict postoperative mortality for cardiac surgery. These studies investigated whether new, more-complex machine-learning approaches could improve mortality risk prediction by automatically discovering nonlinear interactions among input variables.

The results of this meta-analysis show modest improvements at best, and even those are problematic, given some obvious methodologic concerns. For example, the authors selected the best-performing machine-learning systems from each study, which may inflate the apparent improvements. They also include 2 studies with near-perfect performance (Jamaati and colleagues,[3] c-index of 0.986 for support vector machines; and Mejia and colleagues,[4] c-index of 0.982 using random forests). These appear to be nonrepeatable, unrealistically high levels of predictive accuracy. Insufficient details are provided to determine whether these studies collected data from atypical patient populations or inadvertently trained on test data. Finally, except for our original 1997 study (80,606 Society of Thoracic Surgeons database patients overall)[1] and subsequent studies by Nilsson and colleagues[5] (18,362 patients) and Tu and colleagues[6] (15,608 patients), small sample size is a pervasive limitation of most studies in this meta-analysis. Many of the 15 studies had very few patients (fewer than 1000) and very few mortality outcomes (12 to 37), providing little opportunity for machine-learning systems to learn new relationships.

To more fairly evaluate machine-learning approaches, we selected the most relevant studies in this meta-analysis that had well-described, statistically justified methodologies, more than 1000 cardiac procedures (with a majority dedicated to training), and a larger number of mortality end points, conditions under which machine learning might realistically be able to learn more complex interactions. Six of the 15 studies cited by Benedetto and colleagues[2] meet these criteria, as summarized in Table 1. One study[10] with more than 1000 cases was omitted because its Figure 4 suggests incorrect evaluation of receiver operating characteristic curve areas.

Table 1. Selected results from 6 larger, well-documented studies

| Study | Year | Total cases available | Logistic regression (c-index) | Best machine learning (c-index) | Improvement in c-index: machine learning vs logistic regression |
|---|---|---|---|---|---|
| Lippmann and Shahian[1] | 1997 | 80,606 | 0.76 | 0.76 | 0 |
| Tu et al[6] | 1998 | 15,608 | 0.77 | 0.78 | 0.01 |
| Nilsson et al[5] | 2006 | 18,362 | 0.8 | 0.81 | 0.01 |
| Rahman et al[7],* | 2012 | 4976 | 0.89 | 0.91 | 0.02 |
| Mendes et al[8] | 2015 | 1315 | 0.86 | 0.85 | −0.01 |
| Allyn et al[9] | 2017 | 6520 | 0.74 | 0.8 | 0.06 |

* The c-index scores for this study do not appear in the original paper but are cited in the meta-analysis.[2] We cannot confirm their accuracy.

Among these selected studies, machine learning performed slightly worse in one.[8] In the others, improvements in c-indices for a variety of machine-learning approaches were small, ranging from 0.0 in our large study[1] to 0.06 in the study of Allyn and colleagues,[9] the latter using data from only a single institution, which raises obvious concerns regarding overfitting. These findings suggest that with typically available data, machine-learning classifiers provide only a small benefit from learning more complex nonlinear relationships, much the same as we found nearly a quarter century ago. This is in sharp contrast to areas such as speech or visual object recognition, where modern machine-learning approaches lead to substantial performance improvements, facilitated by extremely large, high-dimensional training and test data sets.

Decades after our original study, our interpretation of more recent studies is less positive than that of Benedetto and colleagues.[2] Newer methodologies and dramatically increased computing power have not substantially improved the predictive accuracy of machine learning compared with standard logistic regression for cardiac surgery mortality prediction. In part, these findings may reflect the design of the meta-analysis, which included too many small-sample studies with limited outcomes, scenarios in which machine learning was unlikely to add value. However, even allowing for these and other methodologic limitations previously described, we suspect that the incremental benefit of machine-learning approaches for cardiac surgery mortality prediction is actually rather modest.

Cardiac surgery mortality rates have fallen dramatically, limiting the number of mortality outcomes available to train machine-learning systems. Further, compared with other areas such as interpretation of radiographs, most data available for health care risk prediction are low dimensional, often represented in a binary or categorical format, and only weakly predictive, even when interpreted by trained human experts. Incorporating additional sources of relevant, higher-dimensional data would obviously be advantageous. Machine-learning approaches might demonstrate greater incremental value in areas such as congenital cardiac surgery, where there are many more complex combinations of diagnoses, procedures, and procedure-specific predictors. It may also be more fruitful to study machine-learning performance for more common, non-fatal complications rather than for mortality, which fortunately is a rare outcome.

References

1. Lippmann R.P., Shahian D.M. Coronary artery bypass risk prediction using neural networks. Ann Thorac Surg. 1997; 63: 1635-1643.
2. Benedetto U., Dimagli A., Sinha S., Cocomello L., Gibbison B., Caputo M., et al. Machine learning improves mortality risk prediction after cardiac surgery: systematic review and meta-analysis. J Thorac Cardiovasc Surg. 2022; 163: 2075-2087.e9.
3. Jamaati H., Najafi A., Kahe F., Karimi Z., Ahmadi Z., Bolursaz M., et al. Assessment of the EuroSCORE risk scoring system for patients undergoing coronary artery bypass graft surgery in a group of Iranian patients. Indian J Crit Care Med. 2015; 19: 576-579.
4. Mejia O.A.V., Antunes M.J., Goncharov M., Dallan L.R.P., Veronese E., Lapenna G.A., et al. Predictive performance of six mortality risk scores and the development of a novel model in a prospective cohort of patients undergoing valve surgery secondary to rheumatic fever. PLoS One. 2018; 13: e0199277.
5. Nilsson J., Ohlsson M., Thulin L., Höglund P., Nashef S.A., Brandt J. Risk factor identification and mortality prediction in cardiac surgery using artificial neural networks. J Thorac Cardiovasc Surg. 2006; 132: 12-19.
6. Tu J.V., Weinstein M.C., McNeil B.J., Naylor C.D. Predicting mortality after coronary artery bypass surgery: what do artificial neural networks learn? The Steering Committee of the Cardiac Care Network of Ontario. Med Decis Making. 1998; 18: 229-235.
7. Rahman H.A.A., Wah Y.B., Khairudin Z., Abdullah N.N. Comparison of predictive models to predict survival of cardiac surgery patients. In: 2012 International Conference on Statistics in Science, Business and Engineering (ICSSBE). 2012: 1-5.
8. Mendes R.G., de Souza C.R., Machado M.N., Correa P.R., Di Thommazo-Luporini L., Arena R., et al. Predicting reintubation, prolonged mechanical ventilation and death in post-coronary artery bypass graft surgery: a comparison between artificial neural networks and logistic regression models. Arch Med Sci. 2015; 11: 756-763.
9. Allyn J., Allou N., Augustin P., Philip I., Martinet O., Belghiti M., et al. A comparison of a machine learning model with EuroSCORE II in predicting mortality after elective cardiac surgery: a decision curve analysis. PLoS One. 2017; 12: e0169772.
10. Nouei M.T., Kamyad A.V., Sarzaeem M., Ghazalbash S. Developing a genetic fuzzy system for risk assessment of mortality after cardiac surgery. J Med Syst. 2014; 38: 102.
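For readers less familiar with the c-index, the following is a minimal Python sketch of how it can be computed and how the improvement column of Table 1 follows from the reported values. It is an illustration only, not code from any of the cited studies. The function name `c_index` and the toy risks and outcomes are hypothetical. The Table 1 numbers are copied from the table as reported, including the Rahman et al values, which could not be confirmed.

```python
from itertools import product


def c_index(risks, died):
    """Fraction of (death, survivor) pairs in which the patient who died
    received the higher predicted risk; tied risks count as half a pair.
    This equals the area under the receiver operating characteristic curve."""
    events = [r for r, d in zip(risks, died) if d]
    survivors = [r for r, d in zip(risks, died) if not d]
    if not events or not survivors:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for e, s in product(events, survivors):
        if e > s:
            concordant += 1.0
        elif e == s:
            concordant += 0.5
    return concordant / (len(events) * len(survivors))


# Hypothetical toy data: 6 patients, 2 deaths (not taken from any study).
toy_risks = [0.02, 0.05, 0.04, 0.30, 0.08, 0.01]
toy_died = [0, 0, 1, 1, 0, 0]
toy_c = c_index(toy_risks, toy_died)

# Table 1 as reported: (study, year, total cases, logistic regression c-index,
# best machine-learning c-index).
table1 = [
    ("Lippmann and Shahian", 1997, 80606, 0.76, 0.76),
    ("Tu et al", 1998, 15608, 0.77, 0.78),
    ("Nilsson et al", 2006, 18362, 0.80, 0.81),
    ("Rahman et al", 2012, 4976, 0.89, 0.91),
    ("Mendes et al", 2015, 1315, 0.86, 0.85),
    ("Allyn et al", 2017, 6520, 0.74, 0.80),
]

# Improvement = best machine-learning c-index minus logistic regression c-index,
# rounded to the 2 decimal places used in the table.
improvements = {name: round(ml - lr, 2) for name, _, _, lr, ml in table1}

if __name__ == "__main__":
    print(f"toy c-index: {toy_c:.2f}")
    for name, delta in improvements.items():
        print(f"{name}: {delta:+.2f}")
```

With only 12 to 37 deaths in a study, the number of (death, survivor) pairs that drive this statistic is small, which is one way to see why c-index differences of 0.01 to 0.02 from small studies are hard to interpret.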
