Abstract

Diabetes is a chronic disease that affects a growing number of people and causes a large number of deaths each year. Because late diagnosis leads to severe health complications, methods for early detection of this pathology are critical. Machine learning (ML) techniques aid in the early detection and prediction of diabetes. However, ML models do not perform well when the dataset contains missing values, and imputing those values improves the outcome. This article proposes an ensemble method with a strong emphasis on missing value imputation. Numerous ML models are used to validate the proposed framework. The experiments use the Pima Indian Diabetes Dataset, which contains information about people with and without diabetes. The performance of 24 ML methods is evaluated using true positive rate (TPR), false negative rate (FNR), positive predictive value (PPV), false discovery rate (FDR), overall accuracy, training time, and area under the curve (AUC). The results show that subspace KNN performs best, with an accuracy of 85%. The results are confirmed using receiver operating characteristic (ROC) curves. Combining missing value imputation in pre-processing with classification is shown to outperform state-of-the-art algorithms for diabetes detection.

Keywords: Diabetes prediction, Missing value imputation, ML model
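
For readers unfamiliar with the evaluation metrics named in the abstract, the following minimal sketch shows how TPR, FNR, PPV, FDR, and overall accuracy follow from the counts of a binary confusion matrix using their standard definitions. The function name `rates_from_confusion` and the example counts are assumptions made purely for illustration; they are not taken from the article and do not reproduce its results.

```python
def rates_from_confusion(tp, fn, fp, tn):
    """Compute standard binary-classification rates from confusion counts.

    tp, fn, fp, tn are the numbers of true positives, false negatives,
    false positives, and true negatives. Assumes at least one actual
    positive (tp + fn > 0) and one predicted positive (tp + fp > 0).
    """
    actual_pos = tp + fn      # all truly diabetic cases
    predicted_pos = tp + fp   # all cases the model flagged as diabetic
    total = tp + fn + fp + tn
    return {
        "TPR": tp / actual_pos,        # true positive rate (sensitivity)
        "FNR": fn / actual_pos,        # false negative rate = 1 - TPR
        "PPV": tp / predicted_pos,     # positive predictive value (precision)
        "FDR": fp / predicted_pos,     # false discovery rate = 1 - PPV
        "accuracy": (tp + tn) / total,
    }


# Hypothetical counts, invented for illustration only (not the article's data).
example = rates_from_confusion(tp=30, fn=10, fp=20, tn=40)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

TPR and FNR are complementary, as are PPV and FDR, so each pair sums to 1; reporting both members of a pair is redundant but matches the metric list in the abstract.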
