Abstract

Heart disease is an international public health issue and a significant health risk for many people. The World Health Organization (WHO) identifies heart disease as one of the primary causes of death. Owing to its rapid development and application in many areas, machine learning has become an effective technique for predicting heart disease. Combined with large-scale medical data and advanced algorithms, machine learning can assist healthcare professionals in accurately predicting the risk of heart disease, thus enabling early intervention and treatment for patients. This research paper uses the “heart_2020_cleaned.csv” dataset, containing 319,795 instances and 18 attributes, of which 70% of instances were randomly selected for the training set and the remaining 30% for testing. Four machine learning algorithms used in data mining, Decision Tree (DT), LightGBM, Random Forest (RF) and Logistic Regression (LR), were applied to predict heart disease. Before the models were constructed, data cleaning, feature selection and hyperparameter tuning were carried out, with the aim of exploring potential patterns in the data. A comparative analysis was conducted on the external test set to compare the prediction performance of the different models on an equal footing. LightGBM achieved the highest accuracy, 76.9%, followed by Logistic Regression and Random Forest, with Decision Tree performing worst.
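
The random 70%/30% partition described above can be illustrated with a minimal sketch. It is not the authors' code: the function name `split_indices`, the seed, and the choice to round the training size down are assumptions made for illustration, and only the Python standard library is used.

```python
import random


def split_indices(n_rows, train_fraction=0.7, seed=0):
    """Shuffle row indices and split them into train and test lists.

    Hypothetical helper: the seed and the floor rounding are assumptions,
    not details taken from the paper.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)    # reproducible random order
    n_train = int(n_rows * train_fraction)  # floor; remaining rows form the test set
    return indices[:n_train], indices[n_train:]


# 319795 is the number of instances reported for heart_2020_cleaned.csv.
train_idx, test_idx = split_indices(319795)
print(len(train_idx), len(test_idx))
```

Every row index lands in exactly one of the two lists, so each model is evaluated only on instances it did not see during training.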
