Abstract

Diabetes is one of the deadly non-communicable diseases that affect humans. According to the World Health Organization (WHO), diabetes killed at least 2 million people in 2019. The phases and conditions of diabetes patients are recorded extensively to support research. One of the most recent such records is the early stage diabetes risk prediction dataset, released through the UCI repository in late 2020 by the Diabetes Hospital in Bangladesh. Classification in data mining extracts patterns or models from data to gain new knowledge. Widely used classification algorithms that have proven able to handle large data include K-NN, Naïve Bayes, and Decision Tree. This study compares the three algorithms on the early stage diabetes risk prediction dataset. The results show that the decision tree is the best of the three for classifying this dataset, with an accuracy of 95.96%, followed by K-NN at 92.5% and Naïve Bayes at 86.92%.
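The kind of comparison described in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' pipeline: the abstract does not state which tool, validation scheme, or hyperparameters were used. The data here is synthetic (binary symptom-like flags generated by the hypothetical helper `make_synthetic`), and `k=5`, `max_depth=4`, and the 200/100 hold-out split are assumptions chosen for the example, so the printed accuracies will not match the figures reported above.

```python
import math
import random
from collections import Counter


def make_synthetic(n=300, n_features=8, seed=0):
    """Hypothetical stand-in data: binary feature flags and a binary label."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        y = rng.randint(0, 1)
        # Each feature agrees with the label 75% of the time (arbitrary choice).
        x = tuple(y if rng.random() < 0.75 else 1 - y for _ in range(n_features))
        rows.append((x, y))
    return rows


def knn_predict(train, x, k=5):
    """k-NN with Hamming distance and a majority vote over the k closest rows."""
    nearest = sorted(train, key=lambda r: sum(a != b for a, b in zip(r[0], x)))[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]


def nb_fit(train):
    """Bernoulli naive Bayes with Laplace smoothing."""
    n_features = len(train[0][0])
    model = {}
    for c in (0, 1):
        rows = [x for x, y in train if y == c]
        prior = math.log((len(rows) + 1) / (len(train) + 2))
        # Smoothed estimate of P(feature j = 1 | class c).
        p1 = [(sum(x[j] for x in rows) + 1) / (len(rows) + 2) for j in range(n_features)]
        model[c] = (prior, p1)
    return model


def nb_predict(model, x):
    def score(c):
        prior, p1 = model[c]
        return prior + sum(math.log(p if v else 1 - p) for p, v in zip(p1, x))
    return max(model, key=score)


def gini(rows):
    """Gini impurity of a set of rows with binary labels."""
    if not rows:
        return 0.0
    p = sum(y for _, y in rows) / len(rows)
    return 2 * p * (1 - p)


def tree_fit(rows, features, depth=0, max_depth=4):
    """Greedy tree on 0/1 features; splits minimise weighted Gini impurity."""
    majority = Counter(y for _, y in rows).most_common(1)[0][0]
    if depth == max_depth or not features or gini(rows) == 0.0:
        return majority  # leaf: a plain class label

    def split_cost(j):
        left = [r for r in rows if r[0][j] == 0]
        right = [r for r in rows if r[0][j] == 1]
        return (len(left) * gini(left) + len(right) * gini(right)) / len(rows)

    j = min(features, key=split_cost)
    left = [r for r in rows if r[0][j] == 0]
    right = [r for r in rows if r[0][j] == 1]
    if not left or not right:
        return majority
    rest = [f for f in features if f != j]
    return (j,
            tree_fit(left, rest, depth + 1, max_depth),
            tree_fit(right, rest, depth + 1, max_depth))


def tree_predict(node, x):
    # Internal nodes are (feature, left_subtree, right_subtree) tuples.
    while isinstance(node, tuple):
        j, left, right = node
        node = right if x[j] else left
    return node


def accuracy(predict, test):
    """Fraction of test rows whose predicted label equals the true label."""
    return sum(predict(x) == y for x, y in test) / len(test)


def compare(seed=0):
    data = make_synthetic(seed=seed)
    train, test = data[:200], data[200:]  # simple hold-out split
    nb = nb_fit(train)
    tree = tree_fit(train, list(range(len(train[0][0]))))
    return {
        "K-NN": accuracy(lambda x: knn_predict(train, x), test),
        "Naive Bayes": accuracy(lambda x: nb_predict(nb, x), test),
        "Decision Tree": accuracy(lambda x: tree_predict(tree, x), test),
    }


if __name__ == "__main__":
    for name, acc in compare().items():
        print(f"{name}: {acc:.2%}")
```

Accuracy here is simply the share of held-out rows classified correctly. To reproduce the study's setting, the synthetic rows would need to be replaced with the actual dataset, and the split and hyperparameters with whatever the authors used.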
