Linkage-Data Linear Regression

Li-Chun Zhang, Tiziana Tuoto

doi:10.1111/rssa.12630

Abstract

Data linkage is increasingly used to combine data from different sources, with the aim of identifying and bringing together records from separate files that correspond to the same entities. Data linkage is usually not a trivial procedure, and linkage errors, both false links and missed links, are unavoidable. In these cases, standard statistical techniques may produce misleading inference. In this paper, we propose a method for secondary linear regression analysis, where the linked data have to be prepared by someone else, and neither the match-key variables nor the unlinked records are available to the analyst. We also develop a diagnostic test for the assumption of non-informative linkage errors, which is required by all existing secondary analysis adjustment methods. Our approach offers important advantages: it relies on the realistic assumption that the probabilities of correct linkage vary across records, but it does not assume that one is able to estimate the probability of correct linkage for each individual record. Moreover, it accommodates in a simple manner the general situation where the files are of different sizes and none of them is a subset of another. The proposed methodology of adjustment and testing is studied by simulation and applied to real data.
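
As a rough illustration of why false links can mislead a standard regression, the following toy simulation fits ordinary least squares before and after a fraction of records is mislinked at random. It is only a sketch of the problem, not the adjustment method proposed in the paper. The function name, the sample size, the true slope, and the constant 20% false-link rate are all assumptions made for the example (the paper itself allows the probability of correct linkage to vary across records).

```python
import numpy as np


def simulate_linkage_attenuation(n=5000, beta=2.0, false_link_rate=0.2, seed=0):
    """Toy example (hypothetical setup): compare OLS slopes on correctly
    linked data and on data where some responses are attached to the
    wrong record."""
    rng = np.random.default_rng(seed)

    # Covariate x lives in one file, response y in the other.
    x = rng.normal(size=n)
    y = 1.0 + beta * x + rng.normal(scale=0.5, size=n)

    # Assumed error mechanism: each record is flagged as mislinked
    # independently of x and y (non-informative), and the flagged
    # records have their responses shuffled among themselves.
    y_linked = y.copy()
    wrong = np.flatnonzero(rng.random(n) < false_link_rate)
    y_linked[wrong] = y[rng.permutation(wrong)]

    # np.polyfit with degree 1 returns [slope, intercept].
    slope_true = np.polyfit(x, y, 1)[0]
    slope_linked = np.polyfit(x, y_linked, 1)[0]
    return slope_true, slope_linked


if __name__ == "__main__":
    slope_true, slope_linked = simulate_linkage_attenuation()
    print(f"slope with correct links: {slope_true:.3f}")
    print(f"slope with false links:   {slope_linked:.3f}")
```

Under this assumed mechanism a mislinked response is essentially unrelated to its covariate, so the naive slope is pulled toward zero by roughly the share of false links. An analyst who ignores linkage errors would therefore tend to understate the strength of the relationship.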

Journal: Journal of the Royal Statistical Society Series A: Statistics in Society
Publication Date: Nov 11, 2020
Citations: 9
License type: other-oa

Similar Papers

Ethnic bias in data linkage
Louise Mc Grath-Lone ... Katie Harron
24 May 2021
The Lancet Digital Health | VOL. 3

Evaluating Linkage Quality of Population-Based Administrative Data for Health Service Research.
Ji-Woo Kim ... Hyojung Choi
01 Jan 2024
Journal of Korean medical science | VOL. 39

Unbiased regression estimation under correlated linkage errors
Gunky Kim ... Raymond Chambers
01 Feb 2015
Stat | VOL. 4

Data linkage errors in hospital administrative data when applying a pseudonymisation algorithm to paediatric intensive care records.
Gareth Hagger-Johnson ... Katie Harron
01 Aug 2015
BMJ Open | VOL. 5
