Abstract
This study compares the classification performance of statistical models on highly imbalanced kidney data. The health examination cohort database provided by the National Health Insurance Service in Korea is used to build models with various machine learning methods. The glomerular filtration rate (GFR) is used to diagnose chronic kidney disease (CKD). It is estimated using the Modification of Diet in Renal Disease (MDRD) method and classified into five stages, with stage 3 split into 3A and 3B (1, 2, 3A, 3B, 4, and 5). The resulting CKD stages, based on the estimated GFR, form the six classes of the response variable. This study uses two representative generalized linear models for classification, namely, multinomial logistic regression (multinomial LR) and ordinal logistic regression (ordinal LR), as well as two machine learning models, namely, random forest (RF) and autoencoder (AE). The classification performance of the four models is compared in terms of accuracy, sensitivity, specificity, precision, and F1-measure. To find the model that best classifies CKD stages, the data are divided into 10 folds that preserve the proportion of each CKD stage. Results indicate that RF and AE achieve higher accuracy than the multinomial and ordinal LR models when classifying the response variable. However, when a highly imbalanced dataset is modeled, accuracy can give a distorted picture of actual performance. This occurs because accuracy remains high even if a statistical model classifies minority-class samples into the majority class. To address this problem in performance interpretation, we consider not only the accuracy from the confusion matrix but also the sensitivity, specificity, precision, and F1-measure for each class. To present classification performance with a single value for each model, we calculate the macro-average and micro-weighted values for each model. We conclude that AE is the best model for correctly classifying CKD stages across all performance indices.
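The point about accuracy on imbalanced data can be illustrated with a small sketch. The code below is not the authors' implementation: the function names and the three-class confusion matrix are hypothetical, invented only to show how per-class sensitivity, specificity, precision, and F1-measure are derived from a confusion matrix and then macro-averaged.

```python
def per_class_metrics(cm):
    """Per-class metrics from a square confusion matrix.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j. Undefined ratios (zero denominator) are set to 0.0.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    metrics = []
    for c in range(k):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                         # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(k)) - tp    # predicted c, true class differs
        tn = total - tp - fn - fp
        sens = tp / (tp + fn) if tp + fn else 0.0
        spec = tn / (tn + fp) if tn + fp else 0.0
        prec = tp / (tp + fp) if tp + fp else 0.0
        f1 = 2 * prec * sens / (prec + sens) if prec + sens else 0.0
        metrics.append({"sensitivity": sens, "specificity": spec,
                        "precision": prec, "f1": f1})
    return metrics


def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def macro_average(metrics, key):
    """Unweighted mean of one metric over all classes."""
    return sum(m[key] for m in metrics) / len(metrics)


# Hypothetical, made-up data: a model that assigns every sample to the
# majority class (class 0) of a 950 / 40 / 10 split.
TOY_CM = [
    [950, 0, 0],
    [40, 0, 0],
    [10, 0, 0],
]

if __name__ == "__main__":
    m = per_class_metrics(TOY_CM)
    print("accuracy:", accuracy(TOY_CM))
    print("macro sensitivity:", macro_average(m, "sensitivity"))
    print("macro specificity:", macro_average(m, "specificity"))
```

In this toy case the accuracy is 0.95 although two of the three classes are never detected, which is why the macro-averaged sensitivity (one third) is the more informative summary.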
Highlights
Chronic kidney disease (CKD) is defined as kidney damage or the presence of a decreased glomerular filtration rate (GFR) for more than three months [1]
We investigated national health check data from a cohort of 134,895 samples and evaluated the performance of CKD classification using two generalized linear models for classification, multinomial LR and ordinal LR, and two machine learning algorithms, random forest (RF) and autoencoder (AE)
The average accuracy of the two machine learning models is slightly higher than that of the generalized linear models. When interpreting performance on a highly imbalanced dataset, average sensitivity and average specificity should be considered key criteria
Summary
Chronic kidney disease (CKD) is defined as kidney damage or the presence of a decreased glomerular filtration rate (GFR) for more than three months [1]. The National Kidney Foundation (NKF) created a guideline to help medical doctors identify the level of kidney disease and improve the quality of care for patients with kidney disease. The NKF presents the standard classification of kidney disease into five stages based on how well the kidneys can filter waste and excess fluid out of the blood. In the early stages, the kidneys are still able to filter waste from the blood; in the later stages, they struggle to remove waste and may stop working altogether. Stage 3 is separated into two stages, stage 3a and stage 3b [2].
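The mapping from estimated GFR to the six stage labels can be sketched as follows. This is an illustration under stated assumptions, not the study's code: it assumes the commonly used 4-variable MDRD equation in its IDMS-traceable form (coefficient 175, race term omitted) and the standard eGFR cut-offs of 90, 60, 45, 30, and 15 mL/min/1.73 m². The source does not state which MDRD variant or cut-offs were applied, and the function names are hypothetical.

```python
def mdrd_egfr(serum_creatinine_mg_dl, age_years, female):
    """Estimated GFR in mL/min/1.73 m^2.

    Assumption: 4-variable MDRD equation, IDMS-traceable form (coefficient
    175), with the race coefficient omitted.
    """
    egfr = 175.0 * serum_creatinine_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    return egfr


def ckd_stage(egfr):
    """Map an eGFR value to one of six stage labels.

    Assumption: standard cut-offs (90, 60, 45, 30, 15), using eGFR alone.
    """
    if egfr >= 90:
        return "1"
    if egfr >= 60:
        return "2"
    if egfr >= 45:
        return "3a"
    if egfr >= 30:
        return "3b"
    if egfr >= 15:
        return "4"
    return "5"


if __name__ == "__main__":
    # Hypothetical patient: 50-year-old male, serum creatinine 1.0 mg/dL.
    value = mdrd_egfr(1.0, 50, female=False)
    print(round(value, 1), ckd_stage(value))
```

Because this sketch stages on eGFR alone, it matches the six-class response variable described in the abstract; a clinical diagnosis of the early stages normally also considers evidence of kidney damage.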