Abstract

Background Ischaemic heart disease (IHD) is a diverse phenotype conventionally characterized by the presence of its cardinal symptom, angina pectoris (fig. 1). Evidence of the scientific value of electronic health records (EHRs) is accumulating, although point-of-care applications are still limited. In this context, we have curated an EHR database (BTH) containing the population-wide medical histories of 2,658,323 patients from the years 2006–2016. The database supports discovery of new risk-stratifying principles by linking phenotypic data from EHRs, genotypes from 107,690 cardiovascular patients and 40 years of population-wide registry data.
Purpose To establish a model that defines IHD onset and categorizes IHD patients by degree of disease progression, and then to discover phenotypic and genetic features that demarcate these patient strata. Features may be combinations of, e.g., symptoms, vital signs, diagnoses and genetic variants that reduce or increase the risk of disease progression, cancel each other out or enhance one another's effect, interacting non-linearly.
Methods In this retrospective cohort study, cases were defined by integrated assessment of the Danish National Patient Registry (NPR) and BTH. Patients were included if they had an entry in BTH and were: (i) subject to coronary arteriography (CAG), cardiac CT or MRI where IHD was the action diagnosis, or (ii) subject to CAG, cardiac CT or MRI and diagnosed with IHD within one month. EHRs covering the two criteria were available for all cases and facilitated longitudinal alignment of patients with respect to disease onset and progression. IHD was assessed by lookup in NPR of ICD-10 codes I20-I25 and Danish Health Care Classification codes UXAC85, UXCC00A and UXMC80. Vital signs were extracted from the free text in EHRs by application of regular expressions (regex), and symptoms by named entity recognition using controlled vocabularies.
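The regex extraction and the registry lookup described in the Methods can be illustrated with a minimal sketch. It is not the study's actual pipeline: the pattern `BP_PATTERN`, the plausibility bounds, the 31-day reading of "within one month" and all function names are assumptions made for illustration, and real clinical notes (in Danish) would need a broader vocabulary.

```python
import re
from datetime import date

# Hypothetical pattern: a blood pressure keyword followed by "systolic/diastolic".
# The keyword list is an assumption; the abstract does not give the real patterns.
BP_PATTERN = re.compile(
    r"\b(?:BP|blood pressure)\b\s*:?\s*(\d{2,3})\s*/\s*(\d{2,3})",
    re.IGNORECASE,
)

# ICD-10 codes I20-I25, the range the abstract uses to assess IHD.
IHD_ICD10 = re.compile(r"^I2[0-5]")


def extract_blood_pressure(note):
    """Return a list of (systolic, diastolic) readings found in a free-text note."""
    readings = []
    for match in BP_PATTERN.finditer(note):
        systolic, diastolic = int(match.group(1)), int(match.group(2))
        # Assumed plausibility bounds to discard obvious non-readings.
        if 50 <= systolic <= 300 and 20 <= diastolic <= 200 and systolic > diastolic:
            readings.append((systolic, diastolic))
    return readings


def is_ihd_code(code):
    """True if an ICD-10 code falls in the I20-I25 range."""
    return bool(IHD_ICD10.match(code.strip().upper()))


def diagnosed_within_one_month(imaging_date, diagnosis_date, max_days=31):
    """Criterion (ii), reading 'within one month' as at most 31 days either way (an assumption)."""
    return abs((diagnosis_date - imaging_date).days) <= max_days


if __name__ == "__main__":
    print(extract_blood_pressure("Admitted with chest pain. BP 140/90 mmHg, pulse 72."))
    print(is_ihd_code("I21.4"))
    print(diagnosed_within_one_month(date(2012, 3, 1), date(2012, 3, 20)))
```

Anchoring the pattern on a keyword rather than matching any `number/number` pair is a deliberate choice here, since dates and dosages in free text often have the same shape.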
Results In the dataset, 78,896 patients (50,761 males) met the inclusion criteria. The inclusion criteria aligned patients longitudinally, facilitating clustering into subgroups that display different progression patterns. By application of regex to the free text in the EHRs, systolic and diastolic blood pressure were identified in 87% of cases. The controlled vocabulary identified about 60,000 patients who had chest pain recorded in the EHR at the time of diagnosis. Ongoing work centers on optimizing clustering strategies and subsequently performing comorbidity and biochemical enrichment analyses. Genetic enrichment will be performed on 21,994 cases.
Conclusion Integrated assessment of various population-wide registries is a promising strategy for curating EHRs for point-of-care applications, here exemplified by IHD. We argue that genetically enriched EHRs have the potential to be a key element in efforts to obtain a clinical classification system that reflects both etiology and clinically actionable differences in disease progression patterns.
Funding Acknowledgement Type of funding source: Foundation. Main funding source(s): Novo Nordisk Foundation and Innovationsfonden.
