Abstract

Background
Subtle, prognostically meaningful ECG features may not be apparent to physicians. In the course of supervised machine learning (ML) training, many thousands of ECG features are identified. These are not limited to conventional ECG parameters and morphology.

Purpose
To investigate novel neural network (NN)-derived ECG features that may have clinical, phenotypic and genotypic associations and prognostic significance.

Methods
We extracted 5,120 NN-derived ECG features from an AI-ECG model trained for six simple diagnoses and applied unsupervised machine learning to identify three phenogroups. The derivation set, the Clinical Outcomes in Digital Electrocardiography (CODE) cohort (n = 1,558,421), is a database of ECGs recorded in primary care in Brazil. There were four external validation cohorts: a cohort of British civil servants (WH II, n = 5,066); a longitudinal study of volunteers in the UK (UK Biobank, n = 42,386); a longitudinal cohort of Brazilian public servants (ELSA-Brasil, n = 13,739); and a cohort of patients with chronic Chagas cardiomyopathy (SaMi-Trop, n = 1,631).

Results
In the derivation cohort (CODE), the three phenogroups had significantly different mortality profiles (Figure 1). After adjusting for known covariates, phenogroup B had a 1.2-fold increase in long-term mortality compared to phenogroup A (HR 1.20, 95% CI 1.17-1.23, p < 0.0001). We externally validated our findings in four diverse cohorts. Phenogroup C was poorly represented in the volunteer cohorts and was therefore excluded from those analyses. Phenogroup B had a significantly greater risk of mortality in all cohorts (Figure 1). We performed a phenome-wide association study (PheWAS) in the UK Biobank. ECG phenogroup was significantly associated with cardiac and non-cardiac phenotypes, including cardiac chamber volumes and cardiac output (Figure 2A). A single-trait genome-wide association study (GWAS) was conducted and yielded four loci (Figure 2B). SCN10A, SCN5A and CAV1 have well-described roles in cardiac conduction and arrhythmia. ARHGAP24 has previously been associated with ECG parameters; however, our analysis identifies ARHGAP24 for the first time as a gene associated with a prognostically significant phenogroup. Mendelian randomisation demonstrated that the higher-risk ECG phenogroup was causally associated with higher odds of atrioventricular (AV) block but lower odds of atrial fibrillation and ischaemic heart disease.

Conclusion
NN-derived ECG features have important applications beyond the original model from which they are derived and may be transferable and applicable for risk prediction in a wide range of settings, in addition to mortality prediction. We have shown the significant potential of NN-derived ECG features as a highly transferable and potentially universal risk marker that may be applied to a wide range of clinical contexts.
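
The phenogroup derivation step described in the Methods (unsupervised learning on NN-derived feature vectors to obtain three groups) can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, so k-means is assumed here purely for illustration. The function names (`cluster_phenogroups`, `farthest_point_init`, `assign`), the deterministic seeding and the 8-dimensional synthetic data are all hypothetical stand-ins, not the authors' pipeline or data.

```python
import numpy as np


def farthest_point_init(features, k):
    """Deterministic seeding (an illustrative choice): start at row 0, then
    repeatedly add the row farthest from all centroids chosen so far."""
    chosen = [0]
    min_d2 = ((features - features[0]) ** 2).sum(axis=1)
    for _ in range(k - 1):
        nxt = int(min_d2.argmax())
        chosen.append(nxt)
        d2 = ((features - features[nxt]) ** 2).sum(axis=1)
        min_d2 = np.minimum(min_d2, d2)
    return features[chosen].astype(float)


def assign(features, centroids):
    """Index of the nearest centroid (squared Euclidean distance) per row."""
    d2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def cluster_phenogroups(features, k=3, n_iter=100):
    """Lloyd's k-means (assumed algorithm); returns (labels, centroids)."""
    centroids = farthest_point_init(features, k)
    for _ in range(n_iter):
        labels = assign(features, centroids)
        updated = centroids.copy()
        for j in range(k):
            members = features[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    # Recompute labels so they match the returned centroids
    return assign(features, centroids), centroids


# Synthetic toy data: three well-separated groups of 100 "ECGs" each,
# with 8 features standing in for the NN-derived feature vector.
rng = np.random.default_rng(0)
toy = np.concatenate(
    [rng.normal(c, 0.5, size=(100, 8)) for c in (0.0, 10.0, 20.0)]
)
labels, centroids = cluster_phenogroups(toy)
print(np.bincount(labels))  # number of recordings in each group
```

In practice the resulting group labels would then be carried into survival and association analyses; a library implementation with multiple random restarts would normally be preferred over this hand-rolled loop for high-dimensional real data.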