Abstract

Background: Predicting sudden cardiac death (SCD) in the general population remains a significant challenge, particularly at the individual level. SCD and myocardial infarction (MI) share a similar pattern of characteristics and risk factors, which makes it difficult to predict either one specifically. As a result, current prediction models exhibit low specificity.

Methods: To estimate the specific risk of SCD over three months, we trained a machine learning model on electronic health record (EHR) data representing 8,566,229 drug prescriptions and 801,352 hospital diagnoses recorded up to five years prior to SCD. The data were obtained from a French cohort of 12,338 SCD cases and a cohort of 12,338 controls from 2011 to 2015. We then validated the results on two external cohorts: one temporal cohort in the same area between 2016 and 2020, with 11,620 SCD cases and 11,620 controls, and one geographical cohort from the USA, with 892 SCD cases and 892 controls from 2013 to 2021. To assess the specificity of our prediction model for SCD, the SCD cases collected between 2016 and 2020 were also matched with a group of 35,000 controls who had had a myocardial infarction but no SCD, following the same protocol as for the controls drawn from the general population.

Findings: The analysis reveals three distinct distribution patterns for SCD cases, controls, and MI cases. The number of SCD cases increased semi-linearly across the deciles of predicted risk, and our model detected 24% (2,908) of SCD cases with a predicted risk exceeding 90% (highest decile). The controls showed the opposite pattern, decreasing linearly with increasing deciles of predicted risk: most controls were correctly placed in the lowest deciles, and only 214 controls (2%) fell in the highest decile. Interestingly, MI cases were distributed relatively evenly across the deciles, with 10% of MI cases in the highest decile (Figure 1).
Interpretation: We developed and validated a personalized prediction model that can specifically identify subjects at high risk of SCD.

Figure 1: Histogram of predicted risks for sudden cardiac death cases, controls, and myocardial infarction cases in the validation cohort.
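
The decile comparison summarized in the Findings and in Figure 1 can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes that each subject has a predicted risk between 0 and 1 and that "deciles of predicted risk" means ten equal-width risk bins (0-10%, ..., 90-100%), which the abstract does not state explicitly. The function names and the toy risk values are hypothetical and serve only as an example.

```python
from collections import Counter


def decile_index(risk: float) -> int:
    """Map a predicted risk in [0, 1] to a bin index 0..9.

    Assumption: equal-width bins, so bin 9 holds risks of 0.9 and above.
    """
    if not 0.0 <= risk <= 1.0:
        raise ValueError(f"risk must be in [0, 1], got {risk}")
    # A risk of exactly 1.0 would land in bin 10, so clamp it to bin 9.
    return min(int(risk * 10), 9)


def decile_shares(risks):
    """Return the fraction of subjects falling in each of the 10 risk bins."""
    risks = list(risks)
    if not risks:
        raise ValueError("risks must not be empty")
    counts = Counter(decile_index(r) for r in risks)
    return [counts.get(i, 0) / len(risks) for i in range(10)]


# Hypothetical toy predictions, not data from the study.
toy_groups = {
    "SCD cases": [0.95, 0.92, 0.55, 0.15],
    "controls": [0.05, 0.08, 0.12, 0.95],
}

for name, risks in toy_groups.items():
    shares = decile_shares(risks)
    print(f"{name}: share in highest bin = {shares[9]:.0%}")
```

Comparing the share of each group in the highest bin is the kind of summary the abstract reports (24% of SCD cases, 2% of controls, and 10% of MI cases). A model that separated SCD from MI poorly would place a similarly large share of MI cases in that bin.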