Abstract

Acute kidney injury (AKI) is strongly associated with adverse clinical outcomes, including prolonged hospitalization, progression to chronic kidney disease (CKD), and death. Diagnosis of AKI relies on detecting changes in serum creatinine (sCr) and urine output, both of which lag days behind renal injury and are unreliable at initial presentation. Here, we used data mining and machine learning methods to develop a predictive model capable of identifying emergency department (ED) patients at high risk of developing AKI within 7 days of their ED visit. A retrospective cross-sectional cohort of ED visits from 3 hospitals over 2 years was generated and used for model derivation and out-of-sample validation. Clinical data were extracted, by an experienced data user, from the relational database that underlies our electronic health record (EHR) for all adult ED visits in which sCr was measured at the index visit and again within 7 days of ED departure. The primary outcome for prediction was Stage 2 AKI within 7 days of the ED visit, defined according to sCr-based Kidney Disease Improving Global Outcomes (KDIGO) criteria (sCr increase to ≥2 times baseline). Secondary outcomes were KDIGO Stage 1 AKI (sCr increase of ≥0.3 mg/dL above baseline or to ≥1.5 times baseline) and Stage 3 AKI (sCr increase to ≥3 times baseline or to ≥4.0 mg/dL). Predictor variables extracted from the EHR included vital signs, laboratory results, chief complaints, demographics, past medical history, active problems, home medications, and ED medication administrations. Predictions were made at the time of the first metabolic panel result, and only EHR data available before that time were included.
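The sCr-based outcome definitions above can be sketched as a small staging function. This is a minimal illustration under stated assumptions, not the study's implementation: the function name `kdigo_stage` and its arguments are hypothetical, the 7-day observation window is assumed to have been applied when choosing the peak value, the ≥4.0 mg/dL rule is read as requiring a rise above baseline, and differences are rounded to two decimals to avoid floating-point edge cases.

```python
def kdigo_stage(baseline_scr: float, peak_scr: float) -> int:
    """Return an sCr-based AKI stage (0 = no AKI, 1-3 = KDIGO stage).

    Both values are in mg/dL. `peak_scr` is assumed to be the highest
    sCr observed within the follow-up window (7 days in the abstract).
    """
    if baseline_scr <= 0:
        raise ValueError("baseline sCr must be positive")

    ratio = peak_scr / baseline_scr
    # Round so that e.g. 1.2 - 0.9 is treated as exactly 0.3 mg/dL.
    rise = round(peak_scr - baseline_scr, 2)

    # Stage 3: increase to >= 3x baseline, or increase to >= 4.0 mg/dL.
    # Assumption: the absolute rule only counts if sCr actually rose.
    if ratio >= 3.0 or (peak_scr >= 4.0 and rise > 0):
        return 3
    # Stage 2 (primary outcome): increase to >= 2x baseline.
    if ratio >= 2.0:
        return 2
    # Stage 1: rise of >= 0.3 mg/dL or increase to >= 1.5x baseline.
    if rise >= 0.3 or ratio >= 1.5:
        return 1
    return 0


def is_primary_outcome(baseline_scr: float, peak_scr: float) -> bool:
    """Binary label for the primary outcome (Stage 2 AKI or worse)."""
    return kdigo_stage(baseline_scr, peak_scr) >= 2
```

Treating Stage 3 visits as positive for the Stage 2 label is itself an assumption; the abstract does not say whether the primary outcome was "Stage 2 or higher" or Stage 2 exactly.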
Predictor variables were normalized as follows: ED vital signs and laboratory results were summarized as minimum and maximum values; nephrotoxic and nephroprotective medications were grouped by pharmacologic class; and least absolute shrinkage and selection operator (LASSO) feature selection was applied to chief complaints and active problems to identify variables with predictive value for AKI. Multiple machine learning models (logistic regression, decision tree, linear discriminant analysis, support vector machine, and random forest) were generated and tested in the prediction of our primary outcome. All were developed using a training dataset comprising 90% of encounters and evaluated in the remaining encounters using 10-fold cross-validation. Performance of each model was assessed using binary classification measures and receiver operating characteristic (ROC) curve analyses.
Our final cohort included 127,183 ED visits by 72,539 unique patients. Median age was 58 years (IQR: 43-71), and the most common high-risk comorbidities were hypertension (51.8%) and heart failure (9.8%). Incidence of AKI in our cohort was as follows: Stage 1, 12.4%; Stage 2, 1.5%; Stage 3, 1.0%. Predictive model performance, measured as area under the ROC curve, ranged from 0.661 (95% CI: 0.637-0.685) for the decision tree to 0.771 (95% CI: 0.759-0.783) for the random forest. Machine learning methods applied to EHR data identified ED patients at high risk for AKI well before patients met diagnostic criteria. The model developed here, when paired with nephroprotective point-of-care clinical decision support, has the potential to improve outcomes for this patient population.
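The evaluation scheme described above (10 folds, each training on about 90% of encounters and scoring the held-out 10% by area under the ROC curve) might look roughly like the following standard-library sketch. All names here (`roc_auc`, `stratified_k_fold`, `cross_validated_auc`, and the `fit`/`predict` callables) are hypothetical. Stratifying folds by outcome is an assumption made because the primary outcome is rare (1.5%); the abstract does not state whether folds were stratified.

```python
import random


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half. Quadratic in sample size, so for illustration only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs with both classes spread over folds."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)
    # Deal each class round-robin so every fold gets a share of both.
    folds = [pos[i::k] + neg[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def cross_validated_auc(X, y, fit, predict, k=10, seed=0):
    """Mean held-out AUC over k folds.

    `fit(X_train, y_train)` returns a model; `predict(model, x)` returns a
    risk score for one encounter. Both are supplied by the caller.
    """
    aucs = []
    for train, test in stratified_k_fold(y, k, seed):
        model = fit([X[i] for i in train], [y[i] for i in train])
        scores = [predict(model, X[i]) for i in test]
        aucs.append(roc_auc([y[i] for i in test], scores))
    return sum(aucs) / len(aucs)
```

Because `fit` and `predict` are passed in, the same loop could be reused for each of the five model families so that all are compared on identical folds.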
