Machine Learning to Predict Mortality and Critical Events in a Cohort of Patients With COVID-19 in New York City: Model Development and Validation.

Akhil Vaid, Riccardo Miotto, Jagat Narula, Barbara Murphy, Felix Richter, Prem Timsina, Eddye Golden, Manish Paranjpe, Joseph Finkelstein, Laura Huckins, Edgar Argulian, Jessica K De Freitas, Andrew Kasarskis, Ishan Paranjpe, Dara Meyer, Sulaiman Somani, Emilia Bagiella, Manbir Singh, Matteo Danieletto, Arash Kia, Shan Zhao, Patricia Kovatch, Nidhi Naik, Anuradha Lala, Bethany Percha, Adam Russak, Robert Freeman, Carlos Cordon-Cardo, Judy H Cho, Noam D Beckmann, Judith A Aberg, Eric E Schadt, Alexander Charney, Samuel Lee, Kipp W Johnson, Fayzan Chaudhry, Allan C Just, Valentín Fuster, Dennis S Charney, Eric J Nestler, David L Reich, Erwin P Böttinger, Carol R Horowitz, Zahi A Fayad, Girish N Nadkarni, Paul O’Reilly, Matthew A Levin, Benjamin S Glicksberg

doi:10.2196/24018

Abstract

Background: COVID-19 has infected millions of people worldwide and is responsible for several hundred thousand fatalities. The COVID-19 pandemic has necessitated thoughtful resource allocation and early identification of high-risk patients. However, effective methods to meet these needs are lacking.

Objective: The aims of this study were to analyze the electronic health records (EHRs) of patients who tested positive for COVID-19 and were admitted to hospitals in the Mount Sinai Health System in New York City; to develop machine learning models for making predictions about the hospital course of the patients over clinically meaningful time horizons based on patient characteristics at admission; and to assess the performance of these models at multiple hospitals and time points.

Methods: We used Extreme Gradient Boosting (XGBoost) and baseline comparator models to predict in-hospital mortality and critical events at time windows of 3, 5, 7, and 10 days from admission. Our study population included harmonized EHR data from five hospitals in New York City for 4098 COVID-19–positive patients admitted from March 15 to May 22, 2020. The models were first trained on patients from a single hospital (n=1514) admitted on or before May 1, externally validated on patients from four other hospitals (n=2201) admitted on or before May 1, and prospectively validated on all patients admitted after May 1 (n=383). Finally, we established model interpretability to identify and rank variables that drive model predictions.

Results: Upon cross-validation, the XGBoost classifier outperformed baseline models, with an area under the receiver operating characteristic curve (AUC-ROC) for mortality of 0.89 at 3 days, 0.85 at 5 and 7 days, and 0.84 at 10 days. XGBoost also performed well for critical event prediction, with an AUC-ROC of 0.80 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. In external validation, XGBoost achieved an AUC-ROC of 0.88 at 3 days, 0.86 at 5 days, 0.86 at 7 days, and 0.84 at 10 days for mortality prediction. Similarly, the unimputed XGBoost model achieved an AUC-ROC of 0.78 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. Trends in performance on prospective validation sets were similar. At 7 days, acute kidney injury on admission, elevated lactate dehydrogenase (LDH), tachypnea, and hyperglycemia were the strongest drivers of critical event prediction, while higher age, anion gap, and C-reactive protein were the strongest drivers of mortality prediction.

Conclusions: We externally and prospectively trained and validated machine learning models for mortality and critical events for patients with COVID-19 at different time horizons. These models identified at-risk patients and uncovered underlying relationships that predicted outcomes.
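The validation design in the abstract (training on one hospital on or before May 1, external validation on the other hospitals over the same period, and prospective validation on all admissions after May 1) can be illustrated with a minimal sketch. This is not the authors' code: the hospital labels, the `Admission` record, the `risk` scores, and the toy data are hypothetical, and the AUC-ROC is computed with a simple pairwise-ranking definition rather than any particular library.

```python
from dataclasses import dataclass
from datetime import date

# Cutoff taken from the abstract (admissions ran March 15 to May 22, 2020).
CUTOFF = date(2020, 5, 1)
TRAIN_HOSPITAL = "A"  # hypothetical label for the single development hospital


@dataclass
class Admission:
    hospital: str   # hypothetical hospital label
    admitted: date  # admission date
    died: bool      # outcome within the chosen time window
    risk: float     # hypothetical model score in [0, 1]


def assign_cohort(adm: Admission) -> str:
    """Return 'train', 'external', or 'prospective' for one admission."""
    if adm.admitted > CUTOFF:
        return "prospective"  # all hospitals, after May 1
    if adm.hospital == TRAIN_HOSPITAL:
        return "train"        # single hospital, on or before May 1
    return "external"         # other hospitals, on or before May 1


def auc_roc(labels, scores):
    """AUC-ROC as the chance a positive case outranks a negative one.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented purely for illustration.
admissions = [
    Admission("A", date(2020, 3, 20), True, 0.9),
    Admission("A", date(2020, 4, 10), False, 0.2),
    Admission("B", date(2020, 4, 2), True, 0.7),
    Admission("B", date(2020, 4, 28), False, 0.4),
    Admission("C", date(2020, 5, 1), False, 0.8),
    Admission("A", date(2020, 5, 10), True, 0.6),
    Admission("D", date(2020, 5, 20), False, 0.3),
]

cohorts = {}
for adm in admissions:
    cohorts.setdefault(assign_cohort(adm), []).append(adm)

external = cohorts["external"]
external_auc = auc_roc([a.died for a in external], [a.risk for a in external])
```

Keeping the cohort assignment in one function makes the separation explicit: no admission after the cutoff and no admission from another hospital can end up in the training set, which is what lets the external and prospective AUC-ROC values be read as estimates of generalizability.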

Highlights

Despite substantial, organized efforts to prevent disease spread, over 23 million people have tested positive for SARS-CoV-2 worldwide, and the World Health Organization has reported more than 800,000 deaths from the virus to date [1,2,3,4].
At 7 days, acute kidney injury on admission, elevated lactate dehydrogenase (LDH), tachypnea, and hyperglycemia were the strongest drivers of critical event prediction, while higher age, anion gap, and C-reactive protein were the strongest drivers of mortality prediction.
More recent studies have accounted for fundamental aspects of machine learning but are limited in scope [13,18,19,20,21,22]. These studies lack temporal benchmarks, interhospital or prospective validation, systematic evaluation of multiple models, consideration of covariate correlations, or assessment of the impact of imputed data. With these needs in mind, we report the development of a boosted decision tree–based machine learning model trained on electronic health records from patients confirmed to have COVID-19 at a single center in the Mount Sinai Health System (MSHS) in New York City to predict critical events and mortality.

Summary

Introduction

Despite substantial, organized efforts to prevent disease spread, over 23 million people have tested positive for SARS-CoV-2 worldwide, and the World Health Organization has reported more than 800,000 deaths from the virus to date [1,2,3,4]. More recent studies have accounted for fundamental aspects of machine learning but are limited in scope [13,18,19,20,21,22]. These studies lack temporal benchmarks, interhospital or prospective validation, systematic evaluation of multiple models, consideration of covariate correlations, or assessment of the impact of imputed data. With these needs in mind, we report the development of a boosted decision tree–based machine learning model trained on electronic health records from patients confirmed to have COVID-19 at a single center in the Mount Sinai Health System (MSHS) in New York City to predict critical events and mortality. To assess both interhospital and temporal generalizability, we first externally validated this algorithm at four other hospital centers.

Methods

Results

Discussion

Conclusion