Abstract
BackgroundWhile the importance of record linkage is widely recognised, few studies have attempted to quantify how linkage errors may have impacted on their own findings and outcomes. Even where authors of linkage studies have attempted to estimate sensitivity and specificity based on subjects with known status, the effects of false negatives and positives on event rates and estimates of effect are not often described.MethodsWe present quantification of the effect of sensitivity and specificity of the linkage process on event rates and incidence, as well as the resultant effect on relative risks. Formulae to estimate the true number of events and estimated relative risk adjusted for given linkage sensitivity and specificity are then derived and applied to data from a prisoner mortality study. The implications of false positive and false negative matches are also discussed.DiscussionComparisons of the effect of sensitivity and specificity on incidence and relative risks indicate that it is more important for linkages to be highly specific than sensitive, particularly if true incidence rates are low. We would recommend that, where possible, some quantitative estimates of the sensitivity and specificity of the linkage process be performed, allowing the effect of these quantities on observed results to be assessed.
Highlights
Record linkage is the task of bringing together information from two or more different sources that pertain to the same individual
The purpose of this paper is to provide further assistance to researchers who are attempting to appraise the possible impact of errors in the linkage process
Results of epidemiological studies using outcomes determined by linked datasets are affected by errors in linkage
Summary
Record linkage is the task of bringing together information from two or more different sources that pertain to the same individual. Record linkage studies are constrained by the quality of the datasets being linked and by the methods of linkage used [11,12,13]. Linkage can be more complex if the amount or quality of identifying data for individuals are limited. In these cases, probabilistic linkage methods have become widely used [14]. While the importance of record linkage is widely recognised, few studies have attempted to quantify how linkage errors may have impacted on their own findings and outcomes. Even where authors of linkage studies have attempted to estimate sensitivity and specificity based on subjects with known status, the effects of false negatives and positives on event rates and estimates of effect are not often described
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.