Abstract

Diabetes mellitus is a chronic disease with worldwide impact. According to the WHO, it affects 422 million people and causes 1.6 million deaths per year. Neglecting the diagnosis of diabetes mellitus can lead to further health problems such as heart attacks and vision problems. Classification algorithms are used in various domains, including business, education, recommendation systems, and healthcare. This study uses the Pima Indian Diabetes Dataset, which consists of 768 records. First, the data are cleaned by replacing outliers and missing values with the median; then 5 feature selection techniques are applied in combination with 15 classification algorithms using Python. The classification algorithms are compared on precision, accuracy, recall, and F1-score using k-fold cross-validation (k = 2, 4, 5, and 10). The multilayer perceptron classifier achieves the highest scores, with a precision of 74.45%, accuracy of 78.70%, recall of 71.26%, and F1-score of 72.82%, when combined with the linear discriminant feature selection technique.
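The evaluation procedure described in the abstract could be sketched roughly as follows. This is a minimal illustration and not the authors' code. It makes several assumptions: scikit-learn is the Python library used, `LinearDiscriminantAnalysis` stands in for the linear discriminant technique, synthetic data from `make_classification` replaces the real dataset, the network size and random seeds are arbitrary, and the outlier replacement step is omitted. Because of these assumptions, the scores will not match the reported figures.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.impute import SimpleImputer
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 768-record dataset (assumption: 8 numeric features).
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, random_state=0
)

# Mark roughly 5% of entries as missing to exercise the median imputation step.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# Fitting every step inside the pipeline keeps each fold's test data
# out of the imputation, scaling, and discriminant projection.
pipeline = make_pipeline(
    SimpleImputer(strategy="median"),            # replace missing values with the median
    StandardScaler(),                            # assumption: scaling helps the MLP converge
    LinearDiscriminantAnalysis(n_components=1),  # two classes allow one discriminant axis
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0),
)

metrics = ["precision", "accuracy", "recall", "f1"]


def evaluate(k):
    """Return the mean of each metric over k stratified folds."""
    cv = StratifiedKFold(n_splits=k, shuffle=True, random_state=0)
    scores = cross_validate(pipeline, X, y, cv=cv, scoring=metrics)
    return {m: float(scores[f"test_{m}"].mean()) for m in metrics}


# Same fold counts as in the abstract.
results = {k: evaluate(k) for k in (2, 4, 5, 10)}
for k, fold_scores in results.items():
    print(k, {m: round(v, 4) for m, v in fold_scores.items()})
```

Wrapping the preprocessing in a pipeline is a design choice of this sketch. The abstract does not say whether cleaning was performed before or within the cross-validation folds.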
