Exploring Early Prediction of Chronic Kidney Disease Using Machine Learning Algorithms for Small and Imbalanced Datasets

Andressa C M Da Silveira,Álvaro Sobrinho,Maria Eliete Pinheiro,Leandro Dias Da Silva,Evandro De Barros Costa,Angelo Perkusich

doi:10.3390/app12073673

Abstract

Chronic kidney disease (CKD) is a worldwide public health problem, usually diagnosed in the late stages of the disease. To alleviate this issue, investment in early prediction is necessary. The purpose of this study is to assist the early prediction of CKD, addressing problems related to imbalanced and limited-size datasets. We used data from medical records of Brazilians with or without a diagnosis of CKD, containing the following attributes: hypertension, diabetes mellitus, creatinine, urea, albuminuria, age, gender, and glomerular filtration rate. We present an oversampling approach based on manual and automated augmentation. We experimented with the synthetic minority oversampling technique (SMOTE), Borderline-SMOTE, and Borderline-SMOTE SVM. We implemented models based on the following algorithms: decision tree (DT), random forest, and multi-class AdaBoosted DTs. We also applied the overall local accuracy and local class accuracy methods for dynamic classifier selection, and the k-nearest oracles-union, k-nearest oracles-eliminate, and META-DES methods for dynamic ensemble selection. We analyzed the models' performances using hold-out validation, multiple stratified cross-validation (CV), and nested CV. The DT model presented the highest accuracy score (98.99%) using manual augmentation and SMOTE. Our approach can assist in designing systems for the early prediction of CKD using imbalanced and limited-size datasets.
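
The core idea behind SMOTE, as mentioned in the abstract, can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain-Python illustration that assumes numeric features only, and the function name `smote`, its parameters, and the toy (creatinine, urea) values are hypothetical. Each synthetic sample is a random point on the segment between a minority-class sample and one of its k nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: (creatinine, urea) pairs for the minority class.
minority = [(1.8, 60.0), (2.1, 72.0), (2.4, 80.0), (3.0, 95.0)]
new_points = smote(minority, n_new=6, k=2)
```

Because every synthetic point is an interpolation, it always lies within the range spanned by the existing minority samples. To avoid leaking synthetic samples into evaluation, oversampling of this kind is normally applied to the training portion of each split only.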

Journal: Applied Sciences
Publication Date: Apr 6, 2022
Citations: 19
License type: CC BY 4.0

Similar Papers

Data oversampling and imbalanced datasets: an investigation of performance for machine learning and feature engineering
Muhammad Mujahid ... Imran Ashraf
Journal of Big Data | VOL. 11
17 Jun 2024

An Endorsement of the Removal of Race From GFR Estimation Equations: A Position Statement From the National Kidney Foundation Kidney Disease Outcomes Quality Initiative
Holly J Kramer ... Michael V Rocco
American Journal of Kidney Diseases | VOL. 80
02 Sep 2022

EVALUATION OF CLASSIFICATION ALGORITHMS WITH SOLUTION TO CLASS IMBALANCE PROBLEM ON ZAKAT DISTRIBUTION DATASET
Wan Nurshazelin Wan Shahidan ... Azlan Abdul Aziz
International Journal of Entrepreneurship and Management Practices | VOL. 7
30 Jun 2024

Comparative Analysis of Machine Learning Models for Fitness Level Prediction with Imbalanced Dataset
Stephanie Chua ... Chia Inn Sii
01 Dec 2022
