Abstract

Imbalanced datasets are a persistent challenge in machine learning, particularly in the medical and public health sectors, where models must both predict accurately and keep misclassification costs low. This paper introduces Enhanced Cost-Sensitive Ensemble Learning (ECSEL), a novel approach that combines cost-sensitive learning with ensemble techniques to address the class imbalance inherent in many critical datasets. We evaluate ECSEL on two datasets: the Framingham Heart Study dataset, a key resource for cardiovascular disease prediction, and a COVID-19 infection forecasting dataset relevant to public health responses. ECSEL significantly outperforms traditional methods such as SMOTE, D-SMOTE, and BP-SMOTE in accuracy, precision, recall, and ROC-AUC, and in particular reduces Type II errors (false negatives) in scenarios where the cost of a missed positive case is exceptionally high. Its superior performance in predicting cardiovascular risk and its robustness in forecasting the spread of COVID-19 indicate its applicability to real-world settings.
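
The abstract does not specify how ECSEL is built, so the following is only a minimal sketch of the general idea of pairing an ensemble with a cost-sensitive decision rule, not the authors' algorithm. It assumes each ensemble member already outputs a positive-class probability per sample; the function names and the 1:5 cost ratio are hypothetical choices made for illustration. Averaging the members' probabilities and thresholding at `cost_fp / (cost_fp + cost_fn)` minimizes expected misclassification cost, so a higher false-negative cost lowers the threshold and reduces Type II errors.

```python
from statistics import mean


def cost_sensitive_threshold(cost_fp: float, cost_fn: float) -> float:
    """Probability threshold that minimizes expected misclassification cost.

    Predicting positive is cheaper in expectation when
    p * cost_fn >= (1 - p) * cost_fp, i.e. p >= cost_fp / (cost_fp + cost_fn).
    """
    return cost_fp / (cost_fp + cost_fn)


def ensemble_predict(member_probs, cost_fp=1.0, cost_fn=5.0):
    """Average per-model probabilities, then apply the cost-sensitive threshold.

    member_probs: one list per ensemble member, each holding that member's
    positive-class probability for every sample (hypothetical input format).
    """
    threshold = cost_sensitive_threshold(cost_fp, cost_fn)
    averaged = [mean(sample) for sample in zip(*member_probs)]
    return [int(p >= threshold) for p in averaged]


# Three hypothetical ensemble members scoring three samples.
member_probs = [
    [0.10, 0.30, 0.70],
    [0.20, 0.20, 0.60],
    [0.05, 0.25, 0.80],
]

# Equal costs give the usual 0.5 threshold; the borderline second sample is negative.
equal_cost = ensemble_predict(member_probs, cost_fp=1.0, cost_fn=1.0)

# Making a false negative five times as costly lowers the threshold to 1/6,
# so the second sample is now flagged as positive.
fn_heavy = ensemble_predict(member_probs, cost_fp=1.0, cost_fn=5.0)
```

In this toy input the second sample has an averaged probability of 0.25: it is classified negative under equal costs but positive once false negatives are weighted more heavily, which is the trade-off a cost-sensitive method makes in favor of recall.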
