Abstract
Early diagnosis and treatment of myocardial infarction (MI) can significantly reduce the severity of the disease. Disease data are often imbalanced, which can lead to poor prediction outcomes when using conventional models, so building an MI risk prediction model from imbalanced datasets remains challenging. This paper presents a novel model, 2GDNN-FL-Stacked, for predicting the risk of MI from imbalanced data. We mitigate the impact of data imbalance on the model by employing random under-sampling and cost-sensitive techniques. We improve the model's identification capabilities by stacking 2GDNN-FL, CatBoost, RandomForest, and LightGBM. Compared with the baseline models, our model's Matthews Correlation Coefficient (MCC), F1-score, and Area Under the ROC Curve (AUC) increased by 0.87% to 15.70%, 0.55% to 9.81%, and 0.75% to 8.11%, respectively, a significant improvement over the performance of a single model on imbalanced datasets. Ablation experiments show that removing any one component degrades model performance, which confirms that every component contributes. The method offers new insights into predicting heart attack risk and has the potential to support clinical decision-making.
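As a rough illustration of two ideas named in the abstract, the sketch below shows random under-sampling of the majority class and the computation of MCC from confusion-matrix counts. It is not the authors' implementation: the function names `random_under_sample` and `mcc`, the toy data, and the 5:95 class ratio are all assumptions made for the example.

```python
import math
import random


def random_under_sample(features, labels, seed=0):
    """Randomly drop majority-class rows until both classes are the same size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Keep every minority row plus an equally sized random subset of the majority.
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [features[i] for i in kept], [labels[i] for i in kept]


def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0 when the denominator vanishes.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical toy data: 5 positive cases among 100 records.
labels = [1] * 5 + [0] * 95
features = [[float(i)] for i in range(100)]

X_bal, y_bal = random_under_sample(features, labels)

# A classifier that always predicts "no MI" looks accurate but has no skill.
always_negative = [0] * len(labels)
accuracy = sum(t == p for t, p in zip(labels, always_negative)) / len(labels)
skill = mcc(labels, always_negative)
```

On this toy data the always-negative predictor is correct on 95 of 100 records, yet its MCC is 0, which is one reason MCC is reported alongside F1-score and AUC when classes are imbalanced.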