Abstract
Automated identification of advanced chronic kidney disease (CKD ≥ III) and of no known kidney disease (NKD) can support both clinicians and researchers. We hypothesized that identification of CKD and NKD can be improved by combining information from different electronic health record (EHR) resources, comprising laboratory values, discharge summaries and ICD-10 billing codes, compared to using each component alone. We included EHRs from 785 elderly multimorbid patients, hospitalized between 2010 and 2015, who were divided into a training dataset and a test dataset (n = 156). We evaluated different classification models using the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUCPR), each with a 95% confidence interval. In the test dataset, the combination of EHR components as a simple classifier identified CKD ≥ III (AUROC 0.96[0.93–0.98]) and NKD (AUROC 0.94[0.91–0.97]) better than laboratory values (AUROC CKD 0.85[0.79–0.90], NKD 0.91[0.87–0.94]), discharge summaries (AUROC CKD 0.87[0.82–0.92], NKD 0.84[0.79–0.89]) or ICD-10 billing codes (AUROC CKD 0.85[0.80–0.91], NKD 0.77[0.72–0.83]) alone. Logistic regression and machine learning models improved recognition of CKD ≥ III compared to the simple classifier if only laboratory values were used (AUROC 0.96[0.92–0.99] vs. 0.86[0.81–0.91], p < 0.05) and improved recognition of NKD if information from previous hospital stays was used (AUROC 0.99[0.98–1.00] vs. 0.95[0.92–0.97], p < 0.05). Depending on the availability of data, correct automated identification of CKD ≥ III and NKD from EHRs can be improved by generating classification models based on the combination of different EHR components.
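The evaluation described in the abstract (an AUROC with a 95% confidence interval, compared between single EHR components and their combination) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state how the confidence intervals were computed or how the components were combined, so the percentile bootstrap, the sum-of-flags combination, the function names `auroc` and `bootstrap_ci`, and the eight-patient toy dataset are all assumptions made for the example.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUROC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contains one class only; AUROC is undefined
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Synthetic toy data (NOT from the study): 1 = CKD >= III, 0 = otherwise.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
lab_flag = [1, 1, 0, 1, 0, 1, 0, 0]   # hypothetical laboratory-value flag
text_flag = [1, 0, 1, 1, 0, 0, 0, 0]  # hypothetical discharge-summary flag
icd_flag = [1, 1, 0, 0, 0, 0, 1, 0]   # hypothetical ICD-10 code flag

# Assumed combination rule: count how many components flag the patient.
combined = [a + b + c for a, b, c in zip(lab_flag, text_flag, icd_flag)]

auc_lab = auroc(labels, lab_flag)
auc_combined = auroc(labels, combined)
ci_combined = bootstrap_ci(labels, combined)
```

On this toy data the combined score ranks patients better than the laboratory flag alone, which mirrors the direction of the reported result but says nothing about its size. With only eight patients the bootstrap interval is very wide, so the numbers are useful only for checking the mechanics.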
Highlights
Chronic kidney disease (CKD) is a major public health concern characterized by an increasing prevalence and associated with a high level of morbidity and mortality [1,2].
No known kidney disease (NKD) was associated with younger age, better kidney function and fewer co-morbidities compared to CKD ≥ III (Table 1).
The results of our study demonstrate that, in an elderly multimorbid cohort of hospitalized patients, laboratory values perform better than discharge summaries and ICD-10 billing codes for identifying CKD ≥ III and NKD from electronic health records (EHRs).
Summary
Chronic kidney disease (CKD) is a major public health concern characterized by an increasing prevalence and associated with a high level of morbidity and mortality [1,2]. Accurate identification of CKD or absence of kidney disease (NKD = no known kidney disease) is essential for clinical trials and epidemiological studies. In this context, a particular challenge is to store samples from hospitalized patients with known kidney status in clinical biorepositories, as part of Healthcare-Integrated Biobanking (HIB). At the time of sample selection and storage, only a limited range of information regarding the respective patient phenotype is available. Administrative data such as ICD-10 billing codes are often used in research trials to identify patients with CKD [4]. There is no ICD-10 billing code for NKD, as the purpose of ICD-10 billing codes is to indicate the presence of a disease.