Abstract

Diabetes is a chronic disease that remains a significant global health concern. It is a metabolic disorder that leads to high blood sugar levels and to many other problems, such as stroke, kidney failure, and heart and nerve problems. Researchers have attempted to construct accurate diabetes prediction models for years. However, the subject still faces significant open research issues due to a lack of appropriate data sets and prediction approaches, which pushes researchers toward big data analytics and machine learning (ML)-based methods. This research applies four different machine learning methods to address these problems and to investigate healthcare predictive analytics. The study's primary goal is to examine how big data analytics and ML-based techniques may be applied to diabetes. The results show that the suggested ML-based framework may achieve a score of 86. Health experts and other stakeholders are working to develop classification models that aid in the prediction of diabetes and the formulation of preventive initiatives. The authors critically review the literature on machine learning models and, based on their findings, propose and evaluate a unique intelligent diabetes mellitus prediction framework (IDMPF). They use the framework to develop and assess decision tree (DT)-based random forest (RF) and support vector machine (SVM) learning models for diabetes prediction, which are the most widely used techniques in the literature at the time of writing.
The framework was developed after a rigorous review of existing prediction models in the literature and an examination of their applicability to diabetes. Using the framework, the authors describe the training procedures, model assessment strategies, and issues associated with diabetes prediction, along with the solutions they provide. The findings of this study may be used by health professionals, stakeholders, students, and researchers involved in diabetes prediction research and development. The proposed work gives 83% accuracy with the minimum error rate.

Highlights

  • Diabetes has recently become one of the leading causes of death in developing countries

  • Diabetes is one of the most prevalent diseases and develops as a result of a high amount of glucose (blood sugar) in the bloodstream. The glucose in the blood is the most important energy source for the human body, providing the energy it needs to carry out its tasks. This energy is derived from the food a person consumes and is made available through insulin, which is produced by the pancreas

  • K-Nearest Neighbor (KNN), support vector machine (SVM), logistic regression, and random forest machine learning (ML) techniques are applied to the Pima Indian Diabetes Database (PIDD) data set to investigate the prediction of diabetes. The test is conducted using various parameters such as glucose, blood pressure, and BMI [5] to achieve precision. The remainder of the work is organized as follows


Summary

Introduction

Diabetes is one of the leading causes of death in developing countries. To find a solution for this critical disease, governments and individuals are investing money in research studies. Improved approaches are developed to construct a hybrid early diabetic illness prediction system based on the information gathered from various authors' works. KNN, SVM, logistic regression, and random forest ML techniques are applied to the Pima Indian Diabetes Database (PIDD) data set to investigate the prediction of diabetes. It was discovered that the suggested random forest technique considerably improves the prediction efficiency of heart disease and diabetic illness. The analysis reveals the percentage of people affected by diabetes. It displays the information in the data set, such as age, blood pressure, pregnancies, and glucose. Apart from that, it predicts how many of the 768 people in the data set are affected by diabetes. Few selected attributes such as Indian diabetes dataset pre-process data
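As a rough illustration of the kind of classification experiment described above, the following is a minimal sketch of one of the four techniques, K-Nearest Neighbor, written in plain Python. It is not the authors' implementation. The records are synthetic stand-ins generated at random (768 rows with glucose, blood pressure, and BMI columns); the generating means and spreads, the 80/20 split, the choice of k = 5, and all function names are assumptions made only for this example, so the resulting accuracy says nothing about the real PIDD data.

```python
import math
import random
from collections import Counter


def standardize(train, test):
    """Z-score each feature using statistics from the training rows only."""
    cols = list(zip(*train))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a feature is constant, to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

    return scale(train), scale(test)


def knn_predict(train_x, train_y, query, k=5):
    """Majority vote among the k training rows closest to `query`."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def make_synthetic_records(n=768, seed=0):
    """Hypothetical data: NOT the PIDD records, only a same-sized stand-in."""
    rng = random.Random(seed)
    features, labels = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.35 else 0       # assumed class balance
        glucose = rng.gauss(140 if label else 110, 25)  # assumed distributions
        blood_pressure = rng.gauss(75 if label else 70, 12)
        bmi = rng.gauss(35 if label else 30, 6)
        features.append([glucose, blood_pressure, bmi])
        labels.append(label)
    return features, labels


def run_demo(k=5):
    """Train/test split, scaling, prediction, and accuracy on synthetic data."""
    features, labels = make_synthetic_records()
    split = int(0.8 * len(features))                  # assumed 80/20 split
    train_x, test_x = standardize(features[:split], features[split:])
    train_y, test_y = labels[:split], labels[split:]
    predictions = [knn_predict(train_x, train_y, row, k) for row in test_x]
    acc = accuracy(test_y, predictions)
    return acc, 1.0 - acc                             # accuracy, error rate


if __name__ == "__main__":
    acc, err = run_demo()
    print(f"accuracy={acc:.3f} error_rate={err:.3f}")
```

Features are standardized before the distance computation because glucose, blood pressure, and BMI are on different scales, and an unscaled Euclidean distance would be dominated by the feature with the largest range. The same split and accuracy function could be reused to compare other classifiers on equal terms.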

Results
Machine Learning Classification Models
Summary and Conclusion
Future Work
