Abstract

Background: Record linkage is an important tool for epidemiologists and health planners. Record linkage studies will generally contain some level of residual record linkage error, where individual records are either incorrectly marked as belonging to the same individual, or incorrectly marked as belonging to separate individuals. A key question is whether errors in linkage quality are distributed evenly throughout the population, or whether certain subgroups will exhibit higher rates of error. Previous investigations of this issue have typically compared linked and un-linked records, which can conflate bias caused by record linkage error, with bias caused by missing records (data capture errors).

Methods: Four large administrative datasets were individually de-duplicated, with results compared to an available ‘gold-standard’ benchmark, allowing us to avoid methodological issues with comparing linked and un-linked records. Results were compared by gender, age, geographic remoteness (major cities, regional or remote) and socioeconomic status.

Results: Results varied between datasets, and by sociodemographic characteristic. The most consistent findings were worse linkage quality for younger individuals (seen in all four datasets) and worse linkage quality for those living in remote areas (seen in three of four datasets). The linkage quality within sociodemographic categories varied between datasets, with the associations with linkage error reversed across different datasets due to quirks of the specific data collection mechanisms and data sharing practices.

Conclusions: These results suggest caution should be taken both when linking younger individuals and those in remote areas, and when analysing linked data from these subgroups. Further research is required to determine the ramifications of worse linkage quality in these subpopulations on research outcomes.

Highlights

  • Record linkage is an important tool for epidemiologists and health planners

  • Each dataset had previously been de-duplicated to a high quality by jurisdictional linkage units

  • There were some differences between states in terms of remoteness, with Western Australia having a much larger proportion of individuals living in remote areas (7%) compared with the other states (0–1%), and New South Wales having a higher proportion of individuals living in regional areas

Summary

Introduction

Record linkage is a set of methodologies designed to bring together information relating to the same person from within or across datasets [1]. This technique is widely used for conducting longitudinal observational health research [2]. In a recent study of men with and without HIV, the probabilistic linkage method applied (with estimated sensitivity and specificity of 88.4 and 99.7 respectively) led to a finding of a significantly lower rate of hospitalisation in HIV-positive men compared with the general population (0.46, 0.37–0.58), while improvements in linkage quality revealed the opposite finding: a significantly higher rate of hospitalisation in HIV-positive men (1.45, 1.33–1.59) [6].
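To make the idea of subgroup-specific linkage error concrete, the following is a minimal sketch of how de-duplication output might be scored against a gold-standard benchmark within each sociodemographic group. It is an illustration under stated assumptions, not the method used in the study: the function name `pairwise_quality_by_group`, the record layout, the choice of pairwise precision and recall as quality measures, and the rule that a pair counts towards the group of each of its records are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def pairwise_quality_by_group(records):
    """Score linkage against a gold standard, per subgroup.

    records: list of (gold_id, linked_id, group) tuples, where gold_id is the
    benchmark person identifier, linked_id is the identifier assigned by the
    linkage run being evaluated, and group is a subgroup label.
    All names and the record layout are assumptions made for illustration.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    # Compare every pair of records (quadratic, so only suitable for small samples).
    for a, b in combinations(records, 2):
        gold_match = a[0] == b[0]
        linked_match = a[1] == b[1]
        if gold_match and linked_match:
            key = "tp"  # correctly linked pair
        elif linked_match:
            key = "fp"  # false link: different people marked as the same
        elif gold_match:
            key = "fn"  # missed link: same person marked as different
        else:
            continue  # true non-matches are not needed for precision or recall
        # Assumption: a pair counts towards the group of each of its records.
        for group in {a[2], b[2]}:
            counts[group][key] += 1

    quality = {}
    for group, c in counts.items():
        linked = c["tp"] + c["fp"]
        true = c["tp"] + c["fn"]
        quality[group] = {
            "precision": c["tp"] / linked if linked else None,
            "recall": c["tp"] / true if true else None,
        }
    return quality


# Toy data: one missed link among younger records, one false link among older ones.
records = [
    ("p1", "c1", "young"),
    ("p1", "c2", "young"),
    ("p2", "c3", "old"),
    ("p2", "c3", "old"),
    ("p3", "c3", "old"),
]
quality = pairwise_quality_by_group(records)
```

Because both error types are measured against the benchmark rather than by comparing linked with un-linked records, a sketch like this separates linkage error from missing-record (data capture) error, which is the distinction drawn above.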
