Abstract

Predicting the incidence of complex chronic conditions such as heart failure is challenging. Deep learning models applied to rich electronic health records may improve prediction but remain unexplainable, hampering their wider use in medical practice. We aimed to develop a deep learning framework for accurate yet explainable prediction of 6-month incident heart failure (HF). Using 100,071 patients from longitudinal linked electronic health records across the U.K., we applied a novel Transformer-based risk model using all community and hospital diagnoses and medications, contextualized within the age and calendar year of each patient's clinical encounter. Feature importance was investigated with an ablation analysis that compared model performance when features were removed in turn, and by comparing the variability of temporal representations. A post-hoc perturbation technique was used to propagate changes in the input to the outcome for feature contribution analyses. Our model achieved 0.93 area under the receiver operating characteristic curve and 0.69 area under the precision-recall curve on internal 5-fold cross-validation and outperformed existing deep learning models. Ablation analysis indicated that medication is important for predicting HF risk and that calendar year is more important than chronological age, a finding further reinforced by temporal variability analysis. Contribution analyses identified risk factors that are closely related to HF. Many of them were consistent with existing knowledge from clinical and epidemiological research, but several new associations were revealed that had not been considered in expert-driven risk prediction models. In conclusion, the results highlight that our deep learning model, in addition to high predictive performance, can inform data-driven risk factor identification.
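To make the idea of a perturbation-based contribution analysis concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a generic risk model that maps a sequence of clinical codes to a probability; the toy logistic model, its weights, and the code names are all hypothetical stand-ins chosen for illustration and are not results from the study. Each code is removed in turn, and the resulting change in predicted risk is taken as that code's contribution.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical weights for a toy stand-in model. These are illustrative
# assumptions only, NOT values or findings from the paper.
TOY_WEIGHTS: Dict[str, float] = {
    "hypertension": 0.8,
    "diabetes": 0.6,
    "loop_diuretic": 1.2,
    "statin": -0.3,
}
TOY_BIAS = -3.0


def toy_risk_model(codes: List[str]) -> float:
    """Toy logistic model: sum the code weights and squash to a probability."""
    logit = TOY_BIAS + sum(TOY_WEIGHTS.get(code, 0.0) for code in codes)
    return 1.0 / (1.0 + math.exp(-logit))


def perturbation_contributions(
    model: Callable[[List[str]], float], codes: List[str]
) -> List[Tuple[str, float]]:
    """Remove each code in turn and record the drop in predicted risk.

    A positive value means the code raised the predicted risk; a negative
    value means the code lowered it.
    """
    baseline = model(codes)
    contributions = []
    for i, code in enumerate(codes):
        perturbed = codes[:i] + codes[i + 1:]  # record without position i
        contributions.append((code, baseline - model(perturbed)))
    return contributions


if __name__ == "__main__":
    record = ["hypertension", "loop_diuretic", "statin"]
    for code, delta in perturbation_contributions(toy_risk_model, record):
        print(f"{code}: {delta:+.3f}")
```

Any callable that returns a risk score for a code sequence could be substituted for `toy_risk_model`. Removing one code at a time is only one possible perturbation scheme; masking or replacing codes are alternatives that this sketch does not cover.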

Highlights

  • Heart failure (HF) remains a major cause of morbidity, mortality, and economic burden[1]

  • We first investigated disease prevalence and found that 73% of patients >65 with hypertension are treated with antihypertensives while 70% of all diabetic patients are treated with medications for diabetes. Understanding that these diseases frequently contextualise with their respective treatments in older ages, we investigated if these treatments associate strongly with non-heart failure (HF) (RC

  • We show that BEHRT naturally captures that untreated risk factors associate strongly with HF, while treated risk factors, whose risk is mitigated by treatment, appropriately associate less strongly with HF



Introduction

Heart failure (HF) remains a major cause of morbidity, mortality, and economic burden[1]. The growing availability of comprehensive clinical datasets, such as linked electronic health records (EHR) with extensive clinical information from a large number of individuals, together with advances in machine learning, offers new opportunities for developing more robust risk-prediction models than conventional statistical approaches[4], [5]. Prominent deep learning (DL) architectures have shown modest performance in large-scale, complex EHR datasets[6]–[8] for risk prediction of various conditions, including HF. Due to their high level of abstraction, these DL models have typically had poor “explainability”, that is, a limited ability to present results in a language understandable by humans. Explainable DL with rich EHR is still in its infancy; tailoring known methods to improve model explainability in the medical context is crucial.

