Abstract

Although previous studies have shown that heart disease differs between men and women, the importance of specific physical and chemical factors for predicting heart disease in each sex has not been clearly established. In this research, K-means clustering, multiple linear regression, logistic regression, and random forest are applied to the UCI Heart Disease Data Set, which contains a range of physical and chemical indicators. The results show that exercise-induced angina is more significant for predicting heart disease in women, whereas the number of major vessels colored by fluoroscopy is more significant for predicting heart disease in men. Chest pain type is a statistically significant variable for both men and women. Thalassemia, ST depression induced by exercise relative to rest, maximum heart rate, age, and resting blood pressure also have reference value for predicting heart disease. In terms of model accuracy, for women random forest ranks first, logistic regression second, and multiple linear regression third; for men random forest ranks first, multiple linear regression second, and logistic regression third. These findings refine the conclusions of previous studies and suggest, to a certain extent, that sex-specific analysis is relevant to the prevention of heart disease in different groups of people.

