Abstract

Primary care electronic health record (EHR) data are often of clinical importance to cohort studies; however, they require careful handling. Challenges include determining the periods during which EHR data were collected. Participants are typically censored when they deregister from a medical practice, but cohort studies aim to follow participants longitudinally, including those who change practice. Using UK Biobank as an exemplar, we developed a methodology to infer continuous periods of data collection and maximize follow-up in longitudinal studies. This resulted in longer follow-up for around 40% of participants with multiple registration records (mean increase of 3.8 years from the first study visit). The approach did not sacrifice phenotyping accuracy, as assessed by agreement between self-reported and EHR data. A diabetes mellitus case study illustrates how the algorithm supports longitudinal study design and provides further validation. Although we use UK Biobank data, the tools provided can be used for other conditions and studies with minimal alteration.
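
To make the idea of inferring continuous periods of data collection concrete, the following is a minimal sketch and not the algorithm described in the paper. It assumes each participant has a list of practice registration records given as (start, end) dates, and it joins records that overlap or are separated by no more than a chosen gap. The function names, the `max_gap_days` parameter, and the example dates are all hypothetical and chosen only for illustration.

```python
from datetime import date, timedelta


def merge_registration_periods(periods, max_gap_days=0):
    """Merge (start, end) registration records into continuous periods.

    Assumption for illustration: two records belong to the same period
    when the later one starts no more than `max_gap_days` after the
    earlier one ends.
    """
    merged = []
    for start, end in sorted(periods):
        if merged and start <= merged[-1][1] + timedelta(days=max_gap_days):
            # Overlapping or near-adjacent record: extend the current period.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            # Gap is too large: start a new continuous period.
            merged.append((start, end))
    return merged


def follow_up_end(periods, baseline, max_gap_days=0):
    """Return the end of the continuous period containing `baseline`,
    or None if the baseline date falls outside every period."""
    for start, end in merge_registration_periods(periods, max_gap_days):
        if start <= baseline <= end:
            return end
    return None


# Hypothetical participant who changed practice about two weeks apart.
records = [
    (date(2000, 1, 1), date(2008, 6, 30)),
    (date(2008, 7, 15), date(2015, 3, 1)),
]
first_visit = date(2007, 5, 1)

censored_at_deregistration = follow_up_end(records, first_visit)
extended = follow_up_end(records, first_visit, max_gap_days=30)
```

In this sketch, censoring at deregistration ends follow-up at the close of the first record, whereas tolerating a short gap carries follow-up through to the end of the second record. The appropriate gap tolerance, and any checks on whether data were actually being collected during each record, would depend on the data source.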
