Abstract

Introduction. Patients with diabetes are exposed to various cardiovascular risk factors, which lead to an increased risk of cardiac complications. Therefore, the development of a diagnostic system for diabetes and cardiovascular disease (CVD) is a relevant research task. In addition, identifying the most significant indicators of both diseases may help physicians improve treatment, speed up diagnosis, and reduce its computational cost.

Aim. To classify subjects with different diabetes types and to predict the risk of cardiovascular disease in diabetic patients using machine learning methods by identifying correlated indicators.

Materials and methods. Data from the NHANES database were used after preprocessing and balancing. Machine learning methods were used to classify diabetes based on physical examination data and laboratory data. Feature selection methods were used to derive the most significant indicators for predicting CVD risk in diabetic patients. The developed classification and prediction models were optimized using several evaluation metrics.

Results. The developed model (Random Forest) achieved an accuracy of 93.1 % (based on laboratory data) and 88 % (based on physical examination plus laboratory data). The five most common predictors in diabetes and prediabetes were glycohemoglobin, basophil count, triglyceride level, waist size, and body mass index (BMI). These results seem logical, since glycohemoglobin is commonly used to check the amount of glucose (sugar) bound to the hemoglobin in red blood cells. For CVD patients, the most common predictors include eosinophil count (indicative of blood diseases), gamma-glutamyl transferase (GGT), glycohemoglobin, overall oral health, and hand stiffness.

Conclusion. Balancing the dataset and deleting NaN values improved the performance of the developed models. The Random Forest classifier (RFC) and XGBoost models achieved higher accuracy, with gradient descent used to minimize the loss function. The final prediction is made by a weighted majority vote over all the decisions. The result is an automated system for predicting CVD risk in diabetic patients.
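
The preprocessing and voting steps named in the conclusion can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the abstract does not say which balancing method was used, so random oversampling of the minority class is assumed here, and the function names, the feature values (glycohemoglobin, BMI), and the vote weights are all hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def drop_nan_rows(rows):
    """Keep only rows whose feature values are all present (no None, no NaN)."""
    return [
        (features, label)
        for features, label in rows
        if not any(
            v is None or (isinstance(v, float) and math.isnan(v)) for v in features
        )
    ]


def oversample(rows, seed=0):
    """Balance classes by resampling smaller classes with replacement.

    Assumption: random oversampling stands in for whatever balancing
    method the study actually used.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[1]].append(row)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


def weighted_majority_vote(predictions, weights):
    """Return the label whose voters carry the largest total weight."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Hypothetical rows: ([glycohemoglobin %, BMI], label)
rows = [
    ([5.4, 24.1], "no_diabetes"),
    ([5.6, float("nan")], "no_diabetes"),  # dropped: missing BMI
    ([5.2, 22.8], "no_diabetes"),
    ([5.5, 27.0], "no_diabetes"),
    ([7.9, 31.5], "diabetes"),
]

clean = drop_nan_rows(rows)
balanced = oversample(clean)
class_counts = Counter(label for _, label in balanced)

# Three hypothetical voters; the single high-weight voter outweighs the other two.
vote = weighted_majority_vote(
    ["diabetes", "no_diabetes", "diabetes"], [0.5, 0.9, 0.3]
)
```

In this toy run the row with a missing value is removed, both classes end up with the same number of rows, and the vote goes to the label with the larger summed weight rather than the larger head count, which is the difference between a weighted and a simple majority vote.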
