Abstract
Objective: To investigate the impact of atrial flutter (Afl) on the atrial arrhythmia classification task. We additionally advocate the use of a subject-based split in future studies in the field, in order to avoid within-subject correlation, which may lead to over-optimistic inferences. Finally, we examine the effectiveness of the classifiers outside the initially studied circumstances by performing an inter-dataset evaluation of the models on data from different sources.
Methods: ECG signals from two private and three public databases (two MIT-BIH and the Chapman ecgdb) were preprocessed and divided into 10 s segments, which were then subjected to feature extraction. Each resulting dataset was divided into a training and a test set in two ways: a random split and a patient split. Classification was performed with the XGBoost classifier and with two benchmark classification models, using both data splits. The trained models were then used to make predictions on the test data of the remaining datasets.
Results: The XGBoost model yielded the best performance across all datasets compared with the benchmark models. However, model performance varied across datasets, with accuracy ranging from 70.6% to 89.4%, sensitivity from 61.4% to 76.8%, and specificity from 87.3% to 95.5%. When comparing the patient split with the random split, no significant difference was seen in the two private datasets and the Chapman dataset, where the number of samples per patient is low. In the MIT-BIH dataset, however, where the average number of samples per patient is approximately 1300, a noticeable disparity was identified. The accuracy, sensitivity, and specificity of 93.6%, 86.4%, and 95.9% obtained with the random split on this dataset decreased to 88%, 61.4%, and 89.8%, respectively, with the patient split, and the largest drop was in Afl sensitivity, from 71% to 5.4%. The inter-dataset scores were also significantly lower than their intra-dataset counterparts across all datasets.
Conclusions: CAD systems have great potential to assist physicians in the reliable, precise, and efficient detection of arrhythmias. However, although compelling research has been done in the field, yielding models with excellent performance on their own datasets, we show that these results may be over-optimistic. In our study, we give insight into the difficulty of detecting Afl on several datasets and show the need for a higher representation of Afl in public datasets. Furthermore, we show the necessity of a more structured evaluation of model performance, through a patient-based split and an inter-dataset testing scheme, to avoid the problem of within-subject correlation, which may lead to misleadingly high scores. Finally, we stress the need for the creation and use of datasets with a higher number of patients and a more balanced representation of classes if the field is to progress.
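The difference between the random split and the patient split can be illustrated with a minimal sketch. It is not the authors' pipeline: the segment records, the `patient_id` field, the function names, and the toy data (5 patients with 40 segments each) are all assumptions made for illustration. The point it shows is that a random split over segments lets the same patient appear in both the training and the test set, whereas a patient split assigns each patient to exactly one side.

```python
import random
from collections import defaultdict


def random_split(segments, test_fraction=0.25, seed=0):
    """Shuffle individual segments, ignoring which patient they came from."""
    rng = random.Random(seed)
    shuffled = list(segments)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def patient_split(segments, test_fraction=0.25, seed=0):
    """Assign whole patients to either the training or the test set."""
    by_patient = defaultdict(list)
    for seg in segments:
        by_patient[seg["patient_id"]].append(seg)

    patient_ids = sorted(by_patient)
    rng = random.Random(seed)
    rng.shuffle(patient_ids)

    # Hold out at least one patient so the test set is never empty.
    n_test = max(1, int(round(len(patient_ids) * test_fraction)))
    test_ids = patient_ids[:n_test]
    train_ids = patient_ids[n_test:]

    train = [seg for pid in train_ids for seg in by_patient[pid]]
    test = [seg for pid in test_ids for seg in by_patient[pid]]
    return train, test


def shared_patients(train, test):
    """Return the patients that contribute segments to both sets."""
    return {s["patient_id"] for s in train} & {s["patient_id"] for s in test}


# Hypothetical toy data: 5 patients, 40 ten-second segments each.
segments = [
    {"patient_id": pid, "segment_index": idx}
    for pid in range(5)
    for idx in range(40)
]

rand_train, rand_test = random_split(segments)
pat_train, pat_test = patient_split(segments)

print("patients in both sets (random split): ", len(shared_patients(rand_train, rand_test)))
print("patients in both sets (patient split):", len(shared_patients(pat_train, pat_test)))
```

In practice, grouped splitters such as scikit-learn's `GroupShuffleSplit` or `GroupKFold` provide the same guarantee when the patient identifier is passed as the group label. The more segments each patient contributes, the more a random split can overstate performance, which is consistent with the gap the abstract reports for the MIT-BIH data.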