Abstract

Hierarchical clustering, a common 'unsupervised' machine-learning algorithm, is advantageous for exploring potential underlying aetiology in particularly heterogeneous diseases. We investigated potential embolic sources in embolic stroke of undetermined source (ESUS) using a data-driven machine-learning method, and explored variation in stroke recurrence between clusters. We applied a hierarchical k-means clustering algorithm to patients' baseline data. The algorithm assigned each individual to a single cluster, using a minimum-variance method to calculate the similarity between ESUS patients based on all baseline features. Potential embolic sources were categorised into atrial cardiopathy, atrial fibrillation, arterial disease, left ventricular disease, cardiac valvulopathy, patent foramen ovale (PFO) and cancer. Among 800 consecutive ESUS patients (43.3% women, median age 67 years), the optimal number of clusters was four. Left ventricular disease was most prevalent in cluster 1 (present in all patients) and perfectly associated with cluster 1. PFO was most prevalent in cluster 2 (38.9% of patients) and was significantly associated with an increased likelihood of cluster 2 [adjusted odds ratio: 2.69, 95% confidence interval (CI): 1.64-4.41]. Arterial disease was most prevalent in cluster 3 (57.7%) and associated with an increased likelihood of cluster 3 (adjusted odds ratio: 2.21, 95% CI: 1.43-3.13). Atrial cardiopathy was most prevalent in cluster 4 (100%) and perfectly associated with cluster 4. Cluster 3 was the largest cluster, comprising 53.7% of patients. Atrial fibrillation was not significantly associated with any cluster. This data-driven machine-learning analysis identified four clusters of ESUS that were strongly associated with arterial disease, atrial cardiopathy, PFO and left ventricular disease, respectively. More than half of the patients were assigned to the cluster associated with arterial disease.
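The clustering approach described in the abstract can be illustrated with a minimal Python sketch. It follows one common reading of "hierarchical k-means": Ward's minimum-variance agglomeration forms the initial groups, and their centroids then seed a k-means refinement. This is not the authors' pipeline. The five binary features, the 25 toy patients and every function name are assumptions made purely for illustration, whereas the study clustered on all baseline features.

```python
# Toy sketch: Ward (minimum-variance) agglomeration followed by k-means refinement.
# Feature names and patients are hypothetical and are NOT data from the study.

FEATURES = ["lv_disease", "pfo", "arterial_disease", "atrial_cardiopathy", "af"]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def ward_clusters(rows, k):
    """Merge clusters until k remain, always choosing the pair whose merge
    increases the within-cluster sum of squares the least (Ward's criterion)."""
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            ci = centroid([rows[p] for p in clusters[i]])
            for j in range(i + 1, len(clusters)):
                cj = centroid([rows[p] for p in clusters[j]])
                ni, nj = len(clusters[i]), len(clusters[j])
                cost = ni * nj / (ni + nj) * sq_dist(ci, cj)
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def kmeans_refine(rows, centres, max_iter=100):
    """Standard k-means iterations starting from the supplied centres."""
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centres)), key=lambda c: sq_dist(row, centres[c]))
            for row in rows
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(len(centres)):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centres[c] = centroid(members)
    return labels


def hierarchical_kmeans(rows, k):
    """Seed k-means with the centroids of the k Ward clusters."""
    initial = ward_clusters(rows, k)
    centres = [centroid([rows[i] for i in members]) for members in initial]
    return kmeans_refine(rows, centres)


def make_group(pattern, n_without_af, n_with_af):
    """Hypothetical patients sharing one dominant source; AF varies within the group."""
    return ([pattern + [0] for _ in range(n_without_af)]
            + [pattern + [1] for _ in range(n_with_af)])


patients = (
    make_group([1, 0, 0, 0], 3, 2)    # left ventricular disease
    + make_group([0, 1, 0, 0], 3, 2)  # PFO
    + make_group([0, 0, 1, 0], 6, 4)  # arterial disease
    + make_group([0, 0, 0, 1], 3, 2)  # atrial cardiopathy
)
labels = hierarchical_kmeans(patients, k=4)
print(labels)
```

In this toy data, atrial fibrillation varies within every group and so does not define a cluster of its own, while each of the other four features dominates one cluster. Because the procedure involves no random initialisation, rerunning it on the same data yields the same assignment.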

Highlights

  • Atrial fibrillation was not significantly associated with any cluster. This data-driven machine-learning analysis identified 4 clusters of embolic stroke of undetermined source (ESUS) that were strongly associated with arterial disease, atrial cardiopathy, patent foramen ovale (PFO) and left ventricular disease, respectively

  • 17% of all ischemic stroke patients have an embolic stroke of undetermined source (ESUS), i.e. a stroke without an apparent cause despite recommended diagnostic workup[1]

  • Numerous underlying pathologies may serve as embolic sources in patients with ESUS, such as atherosclerotic plaques in the carotids and the aortic arch, covert atrial fibrillation (AF), patent foramen ovale (PFO), left ventricular disease, atrial cardiopathy, cancer and cardiac valvular disease[1]


Summary

Introduction

17% of all ischemic stroke patients have an embolic stroke of undetermined source (ESUS), i.e. a stroke without an apparent cause despite recommended diagnostic workup[1]. Hierarchical clustering, a common “unsupervised” machine-learning algorithm, is advantageous for exploring potential underlying aetiology in heterogeneous diseases like ESUS. The results are reproducible, and the process is fixed once clusters are assigned, so participants cannot be reclassified into a different cluster. This contrasts with standard regression methods, which are used to identify associations between response and explanatory variables. Regression belongs to “supervised” learning, which can be used for multiple testing to determine significant differences between groups that need to be specified a priori[5]. We investigated potential embolic sources in ESUS using a data-driven machine-learning method, and explored variation in stroke recurrence between clusters.
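To make the contrast with regression-style association measures concrete, the sketch below computes an unadjusted odds ratio with a Wald 95% confidence interval from a 2x2 table. The study reports adjusted odds ratios, and the covariates used for adjustment are not given here, so this is only a simplified stand-in. The counts and the function name are hypothetical and do not come from the study.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: source present, patient in the cluster
    b: source present, patient not in the cluster
    c: source absent,  patient in the cluster
    d: source absent,  patient not in the cluster
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts chosen only to exercise the formula
or_, lower, upper = odds_ratio_with_ci(20, 10, 30, 40)
print(f"OR = {or_:.2f}, 95% CI: {lower:.2f}-{upper:.2f}")
```

An interval that lies entirely above 1, as with these toy counts, corresponds to a significantly increased likelihood of cluster membership when the source is present. An adjusted estimate would instead come from a multivariable logistic regression.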

