Abstract

Chronic kidney disease (CKD) is a condition characterized by structural and functional changes to the kidney over time. Studies show that 10% of adults worldwide are affected by some form of CKD, resulting in 1.2 million deaths. CKD has recently emerged as a leading cause of mortality worldwide, which makes it necessary to develop a Computer-Aided Diagnostic (CAD) system that diagnoses CKD automatically. A Machine Learning (ML) based CAD system can help a clinician screen large numbers of patients automatically. Because ML models are generally treated as black boxes, it is also necessary to expose the factors that drive a model's prediction, so that a doctor can make a more rational decision based on both the model's output and an analysis of how each feature influences it. In this paper, we use XGBoost as the ML classifier to predict whether a patient has CKD. Using all 24 features, the XGBoost classifier obtains an accuracy, precision, recall, and F1 score of 99.16%, 100%, 98.68%, and 99.33%, respectively. We then use the Biogeography Based Optimization (BBO) algorithm to find an effective subset of the features. The BBO algorithm selected roughly half of the initial features. Using only the 13 features selected by BBO, we obtain an accuracy, precision, recall, and F1 score of 98.33%, 100%, 97.36%, and 98.67%, respectively. Finally, we explain the impact of the features on the ML model using SHapley Additive exPlanations (SHAP) analysis. Using SHAP analysis and the BBO algorithm, we find that hemoglobin and albumin contribute most to the detection of CKD.
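
As a quick consistency check on the reported scores, the sketch below recomputes F1 from the precision and recall quoted in the abstract, assuming the standard definition of F1 as the harmonic mean of precision and recall. The helper name `f1_score` and the variable names are illustrative and not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, converted from percent.
f1_all = f1_score(1.0, 0.9868)  # all 24 features
f1_bbo = f1_score(1.0, 0.9736)  # 13 BBO-selected features

print(f"F1 (24 features): {100 * f1_all:.2f}%")
print(f"F1 (13 features): {100 * f1_bbo:.2f}%")
```

Both recomputed values agree with the reported 99.33% and 98.67% to within about 0.01 percentage points, which is what one would expect when the inputs are themselves rounded to two decimals.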
