Abstract

This paper applies machine learning techniques to the early detection of diabetes without relying on clinic-dependent data. Using a dataset of 253,680 examples from the Behavioral Risk Factor Surveillance System (BRFSS), the study evaluates several machine learning models: Decision Tree, Random Forest, XGBoost, Neural Networks, SVM, and Naive Bayes. The paper highlights the significance of early diabetes detection and the potential of machine learning to make that process more accessible and efficient. The dataset underwent extensive preprocessing, including under-sampling to address class imbalance and feature engineering to improve model performance. The paper discusses these preprocessing techniques in detail and offers insight into the importance of handling data imbalance and feature selection in machine learning applications for healthcare. The neural network was the top-performing model, achieving an accuracy of 88.76%, a result that underscores the potential of machine learning in diabetes detection. We believe this is valuable because most people avoid visiting the clinic to be checked for diabetes owing to the cost and the loss of time. In conclusion, while we believe that this approach is beneficial, we suggest that the model be used only as a possible indicator, and that a visit to the doctor remains necessary to confirm the presence of diabetes.
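To illustrate the under-sampling step mentioned above, the following is a minimal sketch of random under-sampling in plain Python. The abstract does not state which under-sampling method was used, so the random strategy here is an assumption, and the function name `undersample` and the toy data are hypothetical rather than taken from the study.

```python
import random
from collections import Counter


def undersample(rows, labels, seed=0):
    """Randomly drop rows until every class has as many rows as the smallest class.

    Assumption: plain random under-sampling; the paper's exact method is not given.
    """
    rng = random.Random(seed)

    # Group the feature rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Size of the smallest class sets the target count for every class.
    n_min = min(len(group) for group in by_class.values())

    # Sample n_min rows without replacement from each class.
    balanced = []
    for label, group in by_class.items():
        for row in rng.sample(group, n_min):
            balanced.append((row, label))

    # Shuffle so the classes are not stored in contiguous blocks.
    rng.shuffle(balanced)
    new_rows = [row for row, _ in balanced]
    new_labels = [label for _, label in balanced]
    return new_rows, new_labels


# Hypothetical toy data: 8 negative examples and 2 positive examples.
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2

X_bal, y_bal = undersample(rows, labels)
print(Counter(y_bal))  # each class now has 2 examples
```

Under-sampling discards majority-class examples, so it trades dataset size for balance; with a large survey dataset that trade-off may be acceptable, but it is a design choice rather than a requirement.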

