Abstract
Background: Coronary artery disease (CAD) often goes undetected for years before it manifests and results in substantial morbidity and mortality. Some patients have neither typical risk factors nor symptomatic angina despite progressive disease.
Purpose: Develop a deep-learning model to detect prevalent CAD and identify people at risk for adverse events using electrocardiograms (ECG) in a primary care setting.
Methods: We developed a convolutional neural network using 12-lead ECG waveforms to discriminate the presence of CAD defined using diagnostic codes ("ECG2CAD"). ECG2CAD was trained on 764,670 ECGs from 137,199 individuals at the Massachusetts General Hospital (MGH). Model performance for discrimination of prevalent CAD was measured using AUROC and AUPRC, and compared against a model comprising age and sex, as well as the Pooled Cohort Equations (PCE), in three test sets independent of model training: MGH, Brigham and Women’s Hospital (BWH) and UK Biobank. ECG2CAD was assessed across subgroups of age, sex and self-reported ethnicity and for incident CAD-related events in the BWH primary care subset.
Results: ECG2CAD was evaluated in MGH (N=18,706 [N=6,051 cases], age 57±16 years), BWH (N=88,270 [N=27,898 cases], age 57±16 years), and UK Biobank (N=42,147 [N=1,509 cases], age 65±8 years). ECG2CAD consistently discriminated prevalent CAD (MGH: AUROC 0.782, AUPRC 0.639; BWH: AUROC 0.747, AUPRC 0.588; UK Biobank: AUROC 0.760, AUPRC 0.155) and incrementally improved upon both a model based on age and sex (p<0.01) and the PCE (p<0.01) in MGH and BWH. In the BWH primary care subset (N=51,808), model performance was consistent in subgroups defined by age, sex and self-reported ethnicity/race. The incorporation of ECG2CAD in addition to age and sex consistently improved discrimination (AUROC delta of 0.011-0.086) in all subgroups.
In the BWH primary care subset with over 10 years of follow-up, the presence of either ECG2CAD-predicted CAD (HR 2.18, 95% CI 2.00-2.38, P<0.001) or ICD-based known CAD (HR 2.13, 95% CI 1.86-2.45, P<0.001) was associated with increased risk of incident myocardial infarction. Meeting both definitions was associated with an almost five-fold increase in risk (HR 4.97, 95% CI 4.51-5.48, P<0.001). The risk of incident adverse events was highest among people in the top quintile of ECG-based predicted risk (high risk) compared with people in quintiles 2-4 (intermediate risk) or the lowest quintile (low risk), even when excluding those with known prevalent CAD at baseline.
Conclusions: Artificial intelligence-enabled analysis of the 12-lead ECG may facilitate efficient identification of individuals with possible undiagnosed CAD and inform downstream testing and preventive measures.
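The two evaluation steps named in the abstract, discrimination measured by AUROC and stratification of predicted risk into quintile-based tiers, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `auroc` and `risk_tiers` are hypothetical, the scores are synthetic, and the quintile cut points are assumed to be computed from the evaluated cohort itself, which the abstract does not specify.

```python
from statistics import quantiles


def auroc(scores, labels):
    """AUROC as the probability that a random case outscores a random
    non-case, with ties counted as half (Mann-Whitney formulation)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    # Quadratic in cohort size; fine for a sketch, too slow for large cohorts.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def risk_tiers(scores):
    """Label each score 'low' (lowest quintile), 'intermediate'
    (quintiles 2-4) or 'high' (top quintile).

    Assumption: cut points come from the same cohort being labeled.
    """
    cuts = quantiles(scores, n=5)  # four cut points between five quintiles
    tiers = []
    for s in scores:
        if s <= cuts[0]:
            tiers.append("low")
        elif s > cuts[3]:
            tiers.append("high")
        else:
            tiers.append("intermediate")
    return tiers


if __name__ == "__main__":
    # Synthetic predicted probabilities, for illustration only.
    demo_scores = [i / 100 for i in range(1, 101)]
    tiers = risk_tiers(demo_scores)
    print({t: tiers.count(t) for t in ("low", "intermediate", "high")})
    print(auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

In practice a library routine such as scikit-learn's `roc_auc_score` would replace the pairwise loop; the explicit version is shown only to make the definition visible.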