Abstract

Acute Coronary Syndrome (ACS) is a disease with a high mortality rate: 40% at 5 years after diagnosis. Despite this high mortality, the conventional process of diagnosing ACS can itself be life-threatening. For this reason, several prediagnosis alternatives have been investigated to reduce the need for intensive ACS detection procedures, one of which is a machine learning approach. Machine learning-based prediagnosis uses patient medical record data as input for building detection models. This approach works best when a large amount of data is available and the class labels are fairly balanced. However, in machine learning-based ACS detection studies, researchers often lack balanced data between positive and negative labels, and this imbalance can cause overfitting. The problem persists because obtaining additional data with specific labels is difficult. To address the class imbalance in ACS detection, we generate synthetic ACS data using the K-Means SMOTE method. The synthetic data are used as training data to build an ensemble-based machine learning model. In this study, we obtain an increase in F1 score of more than 10% compared with machine learning models that do not use K-Means SMOTE for oversampling. In addition to the higher F1 score, the resulting models are relatively more resistant to overfitting because the training set contains more diverse data.
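
The oversampling idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified, standard-library-only version of the cluster-then-interpolate idea behind K-Means SMOTE, written under several assumptions. The function names (kmeans, kmeans_smote), the 0.5 minority-share threshold, the even split of synthetic samples across clusters, and the toy data are all hypothetical. The published K-Means SMOTE method additionally weights clusters by how sparse their minority samples are and interpolates toward nearest neighbours rather than random partners.

```python
import math
import random


def kmeans(points, k, rng, iterations=20):
    """Plain Lloyd's k-means; returns a cluster index for every point."""
    centroids = rng.sample(points, k)
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members (unchanged if empty).
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return assignment


def kmeans_smote(X, y, minority=1, k=2, seed=0):
    """Simplified sketch: add synthetic minority rows until a binary
    dataset is balanced, interpolating only inside minority-heavy clusters."""
    rng = random.Random(seed)
    assignment = kmeans(X, k, rng)
    n_minority = sum(1 for label in y if label == minority)
    n_needed = max(len(y) - 2 * n_minority, 0)

    # Keep clusters that hold at least two minority points and in which the
    # minority class makes up more than half of the members (assumed threshold).
    pools = []
    for c in range(k):
        labels = [lab for lab, a in zip(y, assignment) if a == c]
        mino = [p for p, lab, a in zip(X, y, assignment)
                if a == c and lab == minority]
        if len(mino) >= 2 and len(mino) / len(labels) > 0.5:
            pools.append(mino)
    if not pools:
        raise ValueError("no minority-dominated cluster found")

    synthetic = []
    for i in range(n_needed):
        pool = pools[i % len(pools)]   # even split across clusters (assumption)
        a, b = rng.sample(pool, 2)     # two distinct minority points
        t = rng.random()               # interpolation factor in [0, 1)
        synthetic.append(tuple(ai + t * (bi - ai) for ai, bi in zip(a, b)))
    return list(X) + synthetic, list(y) + [minority] * len(synthetic)


# Toy, made-up 2-D data (not patient records): 6 negatives, 3 positives.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5), (0.2, 0.8),
     (9.5, 9.5), (10.0, 10.5), (10.5, 10.0)]
y = [0] * 6 + [1] * 3

X_bal, y_bal = kmeans_smote(X, y)
print(y_bal.count(0), y_bal.count(1))
```

Because every synthetic point is a convex combination of two real minority points from the same cluster, the new samples stay inside the region the minority class already occupies, which is the intuition behind restricting interpolation to clusters. In a setup like the one the abstract describes, oversampling would be applied to the training split only, so that the F1 score is measured on real, untouched records.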
