Abstract
Appropriate management of hypertensive patients relies on the accurate identification of clinically relevant features. However, traditional statistical methods may ignore important information in datasets or overlook possible interactions among features. Machine learning may improve the prediction accuracy and interpretability of regression models by identifying the most relevant features in hypertensive patients. We sought the most relevant features for prediction of cardiovascular (CV) events in a hypertensive population. We used the penalized regression models least absolute shrinkage and selection operator (LASSO) and elastic net (EN) to obtain the most parsimonious and accurate models. The clinical parameters and laboratory biomarkers were collected from the clinical records of 1,471 patients receiving care at Mostoles University Hospital. The outcome was the development of major adverse CV events. Cox proportional hazards regression was performed alone and with penalized regression analyses (LASSO and EN), producing three models. The penalized models were fitted using 10-fold cross-validation. The three predictive models were compared and statistically analyzed to assess their classification accuracy, sensitivity, specificity, discriminative power, and calibration accuracy. The standard Cox model identified five relevant features, while LASSO and EN identified only three (age, LDL cholesterol, and kidney function). The accuracies of the models (prediction vs. observation) were 0.767 (Cox model), 0.754 (LASSO), and 0.764 (EN), and the areas under the curve were 0.694, 0.670, and 0.673, respectively. However, pairwise comparison of performance yielded no statistically significant differences. All three calibration curves showed close agreement between the predicted and observed probabilities of the development of a CV event.
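As an illustration of the classification measures named above, the following minimal sketch computes accuracy, sensitivity, and specificity from binary outcomes. It is not the authors' analysis code: the function name `classification_metrics` and the toy labels are assumptions made up for this example and do not come from the study data.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for binary labels (1 = event)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # events correctly predicted
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-events correctly predicted
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-events predicted as events
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # events that were missed
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of true events detected
        "specificity": tn / (tn + fp),  # share of true non-events detected
    }


# Toy, made-up labels for ten hypothetical patients (not study data).
observed = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]

print(classification_metrics(observed, predicted))
```

In this toy example most patients have no event, so accuracy can look reasonable even when sensitivity is low, which is one reason to report the three measures together.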
Although performance was similar across all three models, both penalized regression analyses produced well-fitting models with fewer features than the standard Cox model and with comparable accuracy. This case study of predictive models using penalized regression analyses shows that penalized regularization techniques can provide predictive models for CV risk assessment that are parsimonious, highly interpretable, and generalizable and that have good fit. For clinicians, a parsimonious model can be useful where available data are limited, as such a model can offer a simple but efficient way to model the impact of the different features on the prediction of CV events. Management of these features may lower the risk for a CV event.
Graphical Abstract
In a clinical setting, with numerous biological and laboratory features and incomplete datasets, traditional statistical methods may ignore important information and overlook possible interactions among features. Our aim was to identify the most relevant features for predicting cardiovascular events in a hypertensive population, using three different regression approaches for feature selection, and thereby to improve the prediction accuracy and interpretability of regression models for these patients.
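To illustrate why penalized regression tends to yield fewer features, the following minimal sketch shows the coordinate-wise elastic net update for a single standardized coefficient in the simple least-squares setting. It is not the authors' implementation: the function names, the penalty strength `lam`, the mixing parameter `alpha`, and the input coefficients are all assumptions chosen for illustration. Penalized Cox solvers typically apply a comparable soft-thresholding step to an approximation of the partial likelihood, which is not reproduced here.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become exactly 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net_update(z, lam, alpha):
    """One-coefficient elastic net solution for a standardized feature.

    z     : unpenalized coefficient estimate (hypothetical input)
    lam   : overall penalty strength (assumed; normally chosen by cross-validation)
    alpha : mixing parameter; alpha=1 is LASSO, alpha=0 is ridge
    """
    return soft_threshold(z, lam * alpha) / (1.0 + lam * (1.0 - alpha))


# Illustrative, made-up unpenalized coefficients for four features.
unpenalized = [0.80, -0.04, 0.30, 0.02]

lasso = [elastic_net_update(z, lam=0.1, alpha=1.0) for z in unpenalized]
enet = [elastic_net_update(z, lam=0.1, alpha=0.5) for z in unpenalized]

# Features whose coefficient is shrunk exactly to zero are dropped from the model.
print("LASSO keeps features:", [i for i, b in enumerate(lasso) if b != 0.0])
print("EN keeps features:   ", [i for i, b in enumerate(enet) if b != 0.0])
```

With these assumed values, both penalties set the two small coefficients to exactly zero and retain only the two larger ones, which is the mechanism behind the more parsimonious models described above.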
More From: Medical & Biological Engineering & Computing