Abstract

Diabetes is a complex disorder that impairs the body's ability to process glucose, the primary energy source for cells. This study predicts the occurrence of diabetes in individuals by applying machine learning algorithms to the PIMA diabetes dataset. Five algorithms are compared: Decision Tree, K-Nearest Neighbor, Random Forest, Logistic Regression, and Support Vector Machine. The experiments were run with two software tools, Waikato Environment for Knowledge Analysis (WEKA) version 3.8.1 and Python version 3.10. Performance was evaluated using true positive rate, false positive rate, precision, recall, F-measure, Matthews correlation coefficient, area under the receiver operating characteristic curve, and area under the precision-recall curve. Mean Absolute Error, Root Mean Squared Error, Relative Absolute Error, and Root Relative Squared Error were also examined to assess model error. Logistic Regression outperformed the other techniques, achieving the highest precision: 81 percent using Python and 80.43 percent using WEKA. These findings indicate that machine learning can be effective for predicting diabetes and that Logistic Regression is a promising tool for this task.
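For readers unfamiliar with the evaluation measures named above, the following is a minimal sketch of how several of them can be computed for a binary classifier. It is not the study's code: the function names are hypothetical, labels are assumed to be encoded as 0 (non-diabetic) and 1 (diabetic), and the relative errors use the mean of the evaluated labels as the baseline predictor, which may differ from the baseline WEKA uses.

```python
import math


def classification_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Each ratio falls back to 0.0 when its denominator is zero.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0        # true positive rate (recall)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0        # false positive rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f_measure = (
        2 * precision * tpr / (precision + tpr) if (precision + tpr) else 0.0
    )
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0

    return {
        "tpr": tpr,
        "fpr": fpr,
        "precision": precision,
        "recall": tpr,
        "f_measure": f_measure,
        "mcc": mcc,
    }


def error_metrics(y_true, y_score):
    """MAE, RMSE, RAE and RRSE of predicted scores against 0/1 labels.

    Assumption: the relative errors compare the model with a baseline that
    always predicts the mean of y_true, so y_true must contain both classes.
    """
    n = len(y_true)
    mean_true = sum(y_true) / n
    abs_err = sum(abs(s - t) for t, s in zip(y_true, y_score))
    sq_err = sum((s - t) ** 2 for t, s in zip(y_true, y_score))
    base_abs = sum(abs(t - mean_true) for t in y_true)
    base_sq = sum((t - mean_true) ** 2 for t in y_true)

    return {
        "mae": abs_err / n,
        "rmse": math.sqrt(sq_err / n),
        "rae": abs_err / base_abs,
        "rrse": math.sqrt(sq_err / base_sq),
    }
```

Both functions take plain Python lists, so they can be applied to the predictions of any of the five classifiers. The areas under the ROC and precision-recall curves are omitted here because they require ranking predictions over all thresholds.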
