An efficient ensemble-based machine learning approach for predicting Chronic Kidney Disease.

Divyanshi Chhabra, Mamta Juneja, Gautam Chutani

doi:10.2174/1573405620666230508104538

Abstract

Chronic kidney disease (CKD) is a long-term condition that can progress to kidney failure. It is among the most serious diseases today, and early detection enables timely treatment. Machine learning techniques have proven reliable for early medical diagnosis. This paper aims to predict CKD using machine learning classification approaches. The dataset was obtained from the machine learning repository at the University of California, Irvine (UCI). Twelve machine learning classification algorithms were trained on the full feature set. Because the CKD dataset is class-imbalanced, the Synthetic Minority Over-sampling Technique (SMOTE) was used to balance the classes, and model performance was evaluated with k-fold cross-validation. The work compares the results of the twelve classifiers with and without SMOTE, and then combines the three classifiers with the highest accuracy (Support Vector Machine, Random Forest, and Adaptive Boosting) in an ensemble to improve performance. A stacking classifier used as the ensemble technique achieved an accuracy of 99.5% under cross-validation. The study thus provides an ensemble learning approach in which the three best-performing classifiers, ranked by cross-validation results, are stacked after the dataset is balanced with SMOTE. The proposed technique could be applied to other diseases in the future, making disease detection less intrusive and more cost-effective.
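The balancing step can be illustrated with a minimal sketch of the SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is a simplified, standard-library-only illustration rather than the authors' implementation. The function names `smote_oversample` and `balance`, the neighbour count `k`, the binary-label setting, and the toy data are all assumptions made for the example.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper: a simplified take on SMOTE, not a library API.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base
        # point (the base itself is excluded by index) and keep the k nearest.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Interpolate: the new point lies on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label, k=5, seed=0):
    """Oversample the minority class of a binary problem up to the majority size."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    new_points = smote_oversample(minority, n_majority - len(minority), k, seed)
    return X + new_points, y + [minority_label] * len(new_points)


# Toy, made-up data: six majority samples (label 0), three minority (label 1).
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.4, 0.2], [0.3, 0.5], [0.5, 0.4],
     [2.0, 2.0], [2.5, 2.1], [2.2, 2.6]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]

X_bal, y_bal = balance(X, y, minority_label=1, k=2)
print(len(X_bal), y_bal.count(0), y_bal.count(1))
```

When balancing is combined with k-fold cross-validation, oversampling is commonly applied to the training folds only, so that synthetic points derived from a sample do not end up in the fold used to evaluate it. The abstract does not state which ordering was used.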

Full Text