Abstract

Objectives: We aimed to create a ‘multidatabase’ algorithm for identification of cholestatic liver injury using multiple linked UK databases, before (1) assessing the improvement in case ascertainment compared to using a single database and (2) developing a new single-database case-definition algorithm, validated against the multidatabase algorithm.

Design: Method development for case ascertainment.

Setting: Three UK population-based electronic health record databases: the UK Clinical Practice Research Datalink (CPRD), the UK Hospital Episode Statistics (HES) database and the UK Office for National Statistics (ONS) mortality database.

Participants: 16 040 people over the age of 18 years with linked CPRD–HES records indicating potential cholestatic liver injury between 1 January 2000 and 1 January 2013.

Primary outcome measures: (1) The number of cases of cholestatic liver injury detected by the multidatabase algorithm. (2) The relative contribution of each data source to multidatabase case status. (3) The ability of the new single-database algorithm to discriminate multidatabase algorithm case status.

Results: Within the multidatabase case identification algorithm, 4033 of 16 040 potential cases (25%) were identified as definite cases based on CPRD data. HES data allowed possible cases to be discriminated from unlikely cases (947 of 16 040, 6%), but only facilitated identification of 1 definite case. ONS data did not contribute to case definition. The new single-database (CPRD-only) algorithm had a very good ability to discriminate multidatabase case status (area under the receiver operator characteristic curve 0.95).

Conclusions: CPRD–HES–ONS linkage confers minimal improvement in cholestatic liver injury case ascertainment compared to using CPRD data alone, and a multidatabase algorithm provides little additional information for validation of a CPRD-only algorithm. The availability of laboratory test results within CPRD but not HES means that algorithms based on CPRD–HES-linked data may not always be merited for studies of liver injury, or for other outcomes relying primarily on laboratory test results.
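The discrimination measure reported above, the area under the receiver operator characteristic curve (AUC), can be read as the probability that a randomly chosen multidatabase case receives a higher single-database score than a randomly chosen non-case. The following is a minimal sketch of that calculation, not the authors' implementation: the function name `roc_auc` and the toy scores and labels are hypothetical and are not drawn from the study data.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    scores: algorithm scores, higher meaning more likely to be a case.
    labels: reference-standard status, 1 = case, 0 = non-case.
    Ties between a case and a non-case count as one half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    wins = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


# Hypothetical toy data for illustration only (not study results).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]
print(roc_auc(toy_scores, toy_labels))  # 13 of 16 case/non-case pairs ranked correctly: 0.8125
```

An AUC of 0.5 corresponds to discrimination no better than chance and 1.0 to perfect separation, which is why the reported value of 0.95 is described as very good.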

Highlights

  • Electronic health records stored within very large population-based primary and secondary care databases are an increasingly important research resource internationally

  • Clinical Practice Research Datalink (CPRD)–Hospital Episode Statistics (HES)–Office for National Statistics (ONS) linkage confers minimal improvement in cholestatic liver injury case ascertainment compared to using CPRD data alone, and a multidatabase algorithm provides little additional information for validation of a CPRD-only algorithm

  • In univariable and multivariable analysis during CPRD algorithm development, liver test result status was shown to perfectly predict multidatabase case status: all individuals with CPRD cholestatic liver test results were classified as cases, while no individuals with an index diagnosis of cholaemia were classified as cases

Introduction

Electronic health records stored within very large population-based primary and secondary care databases are an increasingly important research resource internationally. These are longitudinal records, capturing information generated as part of routine clinical care.[1] A record for an individual patient will include anonymised information on demographics, diagnoses, prescriptions and referrals. Critical for epidemiological studies and active case detection is the ability to accurately identify outcomes. This is often challenging within these databases, where the information has been entered as part of routine clinical care, and not for the purpose of a specific study.
