Abstract

Diabetes is one of the most common medical disorders and occurs due to the malfunctioning of the pancreas. It increases the level of sugar in the body and poses a serious threat to human health by adversely affecting almost all major organs, including the kidneys, heart and eyes. A number of studies in the literature show that machine learning techniques can improve the early detection of disease and reduce medical error rates, helping to save lives. Developing an accurate and effective diabetes prediction model remains a challenge, as medical datasets often suffer from outliers and missing values. The aim of this study is to build an accurate and robust Diabetes Classification and Prediction Model (DCPM) on a dataset that suffers from the class imbalance problem and contains outliers and missing values. The proposed work devises an effective pre-processing technique that removes outliers, fills missing values, standardizes the data and selects relevant features for model learning in a pipelined manner. The proposed pre-processing techniques were applied to the Pima Indian Diabetes (PID) dataset obtained from the University of California, Irvine (UCI) repository. The K-NN classifier is tuned to find the optimal value of k and is then trained and evaluated on the most predictive set of features of the pre-processed PID dataset. The performance of the proposed model is assessed using classification accuracy, precision, recall and F1-score. The proposed approach attains a classification accuracy, recall, precision and F1-score of 92.28%, 92.36%, 92.38% and 92.31%, respectively. The proposed model outperforms existing state-of-the-art approaches in terms of accuracy. Therefore, the proposed DCPM can assist medical experts by providing a quick, precise and reliable recommendation that can be considered when making crucial decisions about a patient's health.
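The pipeline described above (handle missing values and outliers, standardize, tune k, then evaluate with accuracy, precision, recall and F1-score) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the median imputation rule, the 1.5×IQR outlier fence, the leave-one-out search over odd k, the order of the steps, and all function names are assumptions made for illustration. It runs on synthetic two-feature data rather than the PID dataset, and it omits the feature-selection step.

```python
import math
import random
import statistics


def impute_median(rows):
    """Fill missing entries (None) with the median of the observed column values."""
    medians = [statistics.median(v for v in col if v is not None)
               for col in zip(*rows)]
    return [[m if v is None else v for v, m in zip(row, medians)] for row in rows]


def inlier_mask(rows, factor=1.5):
    """Flag rows whose every feature lies inside the per-column IQR fences."""
    fences = []
    for col in zip(*rows):
        q1, _, q3 = statistics.quantiles(col, n=4)
        spread = factor * (q3 - q1)
        fences.append((q1 - spread, q3 + spread))
    return [all(lo <= v <= hi for v, (lo, hi) in zip(row, fences)) for row in rows]


def standardize(train, test):
    """Scale both sets to zero mean, unit variance using training statistics only."""
    means = [statistics.fmean(col) for col in zip(*train)]
    stds = [statistics.pstdev(col) or 1.0 for col in zip(*train)]

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

    return scale(train), scale(test)


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training points closest to x (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    positives = sum(label for _, label in nearest)
    return 1 if 2 * positives > k else 0


def choose_k(X, y, candidates=(1, 3, 5, 7, 9)):
    """Pick the candidate k with the best leave-one-out accuracy on the training set."""
    def loo_accuracy(k):
        hits = sum(knn_predict(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i], k) == y[i]
                   for i in range(len(X)))
        return hits / len(X)
    return max(candidates, key=loo_accuracy)


def scores(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def run_demo(seed=0, n=120):
    """Run the whole sketch on synthetic two-feature data (NOT the PID dataset)."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(4.0 * label, 1.0), rng.gauss(4.0 * label, 1.0)] for label in y]
    for row in X:                      # knock out roughly 5% of the values
        for j in range(len(row)):
            if rng.random() < 0.05:
                row[j] = None
    X[0][0] = 100.0                    # plant one obvious outlier

    # For brevity, imputation and outlier filtering happen before the split.
    X = impute_median(X)
    keep = inlier_mask(X)
    data = [(row, label) for row, label, ok in zip(X, y, keep) if ok]
    rng.shuffle(data)
    split = int(0.75 * len(data))
    train_X, train_y = [r for r, _ in data[:split]], [l for _, l in data[:split]]
    test_X, test_y = [r for r, _ in data[split:]], [l for _, l in data[split:]]

    train_X, test_X = standardize(train_X, test_X)
    k = choose_k(train_X, train_y)
    predictions = [knn_predict(train_X, train_y, x, k) for x in test_X]
    return k, scores(test_y, predictions)


if __name__ == "__main__":
    print(run_demo())
```

Standardizing with training-set statistics only keeps information from the held-out samples out of the distance computation, which matters for a distance-based classifier such as K-NN. The scores this sketch produces on synthetic data say nothing about the figures reported for the PID dataset.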
