Detecting Ventricular Beats with Machine Learning Models

Stojancho Tudjarski, Marjan Gusev, Aleksandar Stankovski

doi:10.23919/mipro55190.2022.9803758

Abstract

This paper aims to model a classifier of ventricular (V) heartbeats by experimenting with the most advanced classical binary classifiers under different feature-engineering scenarios. Methodology: The results were obtained by experimenting with the XGBoost and Random Forest algorithms, two of the most advanced classifiers not based on neural networks. Although the annotated ECG data sets contain records with several heartbeat classes, we focus on a model that distinguishes V heartbeats from all others (Non-V heartbeats). Because the data set is highly imbalanced, we applied the SMOTE oversampling algorithm to provide a better-balanced data set for training the model. To improve the results, we added newly calculated features, with and without feature selection. For feature selection, we used the Fisher Selector algorithm. Data: We used the MIT-BIH Arrhythmia benchmark database, with a train/test split following the patient-oriented splitting approach, which separates the original data set into two subsets of approximately equal size with a similar distribution of heartbeat types. Conclusion: The best results were achieved with the XGBoost algorithm on the original feature set: a precision of 91.36%, a recall of 88.31%, and an F1 score of 89.81%. The results showed that oversampling does not provide significantly better overall model performance. Still, we recommend this approach, since in practice, when dealing with imbalanced data sets, it leads to more robust models that perform better on data outside the training and test sets, such as when the model is used in production.
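
As a quick consistency check on the reported metrics, the F1 score can be recomputed as the harmonic mean of precision and recall. The sketch below is illustrative only and is not the authors' code; the function name `f1_score` and the variable names are assumptions introduced here.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, in percent.
reported_precision = 91.36
reported_recall = 88.31

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}%")  # prints "F1 = 89.81%"
```

The recomputed value agrees with the reported F1 score of 89.81%.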

Full Text