Abstract
Objective: Given the association between vitamin D deficiency and risk for cardiovascular disease, we used machine learning approaches to build a model that predicts the probability of deficiency. Measuring serum 25-hydroxy vitamin D (25(OH)D) provides the best assessment of vitamin D status, but such tests are not always available or feasible. We therefore developed high-sensitivity predictive models to identify patients who are unlikely to have vitamin D deficiency and those who should undergo 25(OH)D testing.

Methods: We collected data from 1002 hypertensive patients at a Spanish university hospital. Elastic net regularization was applied to reduce the dimensionality of the dataset. Determining vitamin D status was treated as a classification problem, and the following classifiers were applied: logistic regression, support vector machine (SVM), random forest, naive Bayes, and extreme gradient boosting. Classification accuracy, sensitivity, specificity, and predictive values were computed to assess the performance of each method.

Results: The SVM with a radial kernel performed better than the other algorithms in terms of sensitivity (98%), negative predictive value (71%), and classification accuracy (73%).

Conclusion: Combining a feature-selection method such as elastic net regularization with a classification approach produced well-fitted models, and the SVM yielded better predictions than the other algorithms. This combination allowed us to develop a predictive model with high sensitivity but low specificity that identifies the population that could benefit from laboratory determination of serum 25(OH)D.
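The performance measures named in the Methods (accuracy, sensitivity, specificity, and predictive values) all derive from a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code: the names `ConfusionCounts` and `screening_metrics` are hypothetical, and the counts in the usage note are invented for illustration and are not taken from the study.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary screen where 'positive' means vitamin D deficient."""
    tp: int  # deficient patients flagged as deficient
    fp: int  # non-deficient patients flagged as deficient
    tn: int  # non-deficient patients flagged as non-deficient
    fn: int  # deficient patients flagged as non-deficient (missed cases)


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def screening_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute the standard performance measures from confusion-matrix counts."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # share of deficient patients detected
        "specificity": _ratio(c.tn, c.tn + c.fp),  # share of non-deficient patients cleared
        "ppv": _ratio(c.tp, c.tp + c.fp),          # positive predictive value
        "npv": _ratio(c.tn, c.tn + c.fn),          # negative predictive value
        "accuracy": _ratio(c.tp + c.tn, total),    # share of all patients classified correctly
    }


if __name__ == "__main__":
    # Invented counts, chosen only to show a high-sensitivity, low-specificity screen.
    example = ConfusionCounts(tp=90, fp=70, tn=30, fn=10)
    for name, value in screening_metrics(example).items():
        print(f"{name}: {value:.4f}")
```

With these invented counts, sensitivity is 0.90 while specificity is only 0.30. That is the trade-off the Conclusion describes: few deficient patients are missed, at the cost of sending many non-deficient patients for a confirmatory 25(OH)D test.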