Abstract

Diabetes is a chronic disease that occurs when the blood sugar level is too high, typically because the body does not make enough insulin, and it affects people of all age groups. Its prevalence has increased significantly in recent years as a result of urbanization, and it now affects millions of people worldwide. Undiagnosed diabetes can lead to life-threatening complications that may be fatal. Early detection of diabetes is therefore vital for maintaining a healthy life, since it can help prevent complications and reduce patients' health risks. This paper aims to design a model that achieves maximum accuracy by using different machine learning algorithms to detect the disease at an early stage. For this purpose, five classifiers are used (Random Forest, Decision Tree, K-Nearest Neighbor, Naïve Bayes, and Deep Learning), and the Vote ensemble approach, which is considered a “best practice” part of the workflow, is then applied to obtain the best possible outcomes with the highest accuracy. The data used in this analysis is taken from the Early Diabetes Classification dataset on Kaggle and was preprocessed with the RapidMiner tool. The main contribution of this research is the implementation and comparative analysis of different ML-based classification models, so that the diagnosis of diabetes with these algorithms is statistically evaluated and compared. The experimental results show that, within the Vote ensemble, the combination of Random Forest and K-NN gives the best results, with the highest accuracy of 97.97%, reported alongside precision, F-measure, and sensitivity.
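As a rough illustration of the voting and evaluation steps described above, the following minimal Python sketch combines the predictions of several classifiers by plurality vote and computes accuracy, precision, sensitivity, and F-measure. It is not the RapidMiner workflow used in the paper. The function names, the toy labels, and the tie-breaking rule (the label seen first wins) are all assumptions made for illustration, and the numbers are not the paper's data.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label list by plurality vote.

    Assumption: ties are resolved in favour of the label that appears first
    among the models for that sample (the order Counter preserves).
    """
    combined = []
    for labels in zip(*predictions_per_model):
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, sensitivity (recall), and F-measure."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f_measure": f_measure,
    }


# Hypothetical toy labels (1 = diabetic, 0 = non-diabetic); not real data.
y_true = [1, 1, 0, 0, 1, 0]
rf_pred = [1, 1, 0, 0, 0, 0]   # stand-in for Random Forest output
knn_pred = [1, 0, 0, 1, 1, 0]  # stand-in for K-NN output
nb_pred = [1, 1, 0, 1, 1, 1]   # stand-in for Naive Bayes output

voted = majority_vote([rf_pred, knn_pred, nb_pred])
metrics = binary_metrics(y_true, voted)
print(voted)
print(metrics)
```

With only two base learners, as in the Random Forest plus K-NN combination, every disagreement is a tie, so the tie-breaking rule largely determines the outcome. Tools that implement voting may resolve ties differently, for example by averaging class confidences.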
