Abstract

Background: This study proposes a cardiovascular disease (CVD) prediction model that applies machine learning (ML) algorithms to the National Health Insurance Service-Health Screening datasets. Methods: We extracted 4699 patients aged over 45 as the CVD group, diagnosed according to the International Classification of Diseases codes I20–I25. In addition, 4699 randomly selected subjects without a CVD diagnosis were enrolled as the non-CVD group. The two groups were matched by age and gender. Various ML algorithms were applied to CVD prediction, and the performance of the resulting models was compared. Results: The extreme gradient boosting, gradient boosting, and random forest algorithms exhibited the best average predictive performance (area under the receiver operating characteristic curve (AUROC): 0.812, 0.812, and 0.811, respectively) among all algorithms validated in this study. Based on AUROC, the ML algorithms improved CVD prediction performance compared to previously proposed prediction models. Preexisting CVD history was the most important factor contributing to the accuracy of the prediction model, followed by total cholesterol, low-density lipoprotein cholesterol, waist-height ratio, and body mass index. Conclusions: Our results indicate that the proposed health screening dataset-based CVD prediction model using ML algorithms is readily applicable, produces validated results, and outperforms previous CVD prediction models.
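The abstract compares models by AUROC, so a small worked computation may help readers unfamiliar with the metric. The sketch below is not the authors' code: the function name `auroc` and the toy labels and scores are assumptions made purely for illustration. It uses the standard rank-sum (Mann-Whitney U) identity, under which AUROC equals the probability that a randomly chosen CVD subject receives a higher predicted risk than a randomly chosen non-CVD subject, with ties counted as one half.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: 1 for the CVD group, 0 for the non-CVD group.
    scores: model-predicted risk; higher means more likely CVD.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    # Sort indices by score, then assign 1-based ranks, averaging over ties.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # U statistic for the positive class, normalized by the number of pairs.
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


if __name__ == "__main__":
    # Toy data, invented for illustration only (not from the study).
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    print(auroc(toy_labels, toy_scores))  # prints 0.75
```

On this scale, 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups, which is the context for reading the reported values of about 0.81.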

Highlights

  • Cardiovascular disease (CVD) is the leading cause of death worldwide, accounting for approximately 17.9 million deaths annually, which is about 30% of all global deaths [1,2]. In Korea, CVD is increasing rapidly because lifestyles have changed and the average age of the population has risen significantly


  • This study aimed to present a CVD prediction model using machine learning (ML) algorithms based on nationwide health screening datasets


Summary

Introduction

Cardiovascular disease (CVD) is the leading cause of death worldwide, accounting for approximately 17.9 million deaths annually, which is about 30% of all global deaths [1,2]. In Korea, CVD is increasing rapidly because lifestyles have changed and the average age of the population has risen significantly. CVD is one of the four major diseases in Korea and ranks second among the leading causes of death, behind cancer [3]. Extensive efforts have been made to analyze the causes of CVD [5]. Hypertension, diabetes, hyperlipidemia, and atherosclerosis have been identified as major factors causing CVD [6]. In addition to physiological and genetic risk factors, behavioral and psychosocial factors are known CVD risk factors. Psychosocial factors include education, financial status, social support, stress, anxiety, and depression [7].

