An Introduction to Probabilistic Record Linkage with a Focus on Linkage Processing for WTC Registries.

Jana Asher, Dean Resnick, Jennifer Brite, James Cone, Robert Brackbill

doi:10.3390/ijerph17186937

Abstract

Since its post-World War II inception, the science of record linkage has grown exponentially and is used across industrial, governmental, and academic agencies. The academic fields that rely on record linkage are diverse, ranging from history to public health to demography. In this paper, we introduce the different types of data linkage and give a historical context to their development. We then introduce the three types of underlying models for probabilistic record linkage: Fellegi-Sunter-based methods, machine learning methods, and Bayesian methods. Practical considerations, such as data standardization and privacy concerns, are then discussed. Finally, recommendations are given for organizations developing or maintaining record linkage programs, with an emphasis on organizations measuring long-term complications of disasters, such as 9/11.

Highlights

From its humble beginnings in post-World War II public health research, the field of “record linkage” (the matching of records for unique entities across one or more lists) has exploded into a multi-field research focus.
Record linkage as a field dates to the end of World War II; the original papers concerned family structure in the United States and elsewhere [1,2,3] and a population registry in Canada [4].
Current research topics related to these concerns revolve around privacy-preserving record linkage and understanding the bias introduced by the requirement for informed consent [26,27].

Summary

Introduction

From its humble beginnings in post-World War II public health research, the field of “record linkage” (the matching of records for unique entities, typically people but sometimes organizations, addresses, or something else, across one or more lists) has exploded into a multi-field research focus (see Figure 1). Several joint studies are being designed to examine pooled patient populations across cohorts. The reasons for this are both scientific and practical. As more data become available electronically and computational power improves, access to health data has become easier, at least from a technical point of view. This is fortuitous, because maintaining a large-scale research project over multiple decades among a trauma-exposed and aging population presents several challenges, chief among them attrition and reporting bias due to respondents' failing memories.

Methods

Data Combining Methods

Ranking

Historical Context

Fellegi-Sunter Model

Machine Learning

Bayesian Record Linkage Techniques

Open Research Questions

Practical Considerations

Data Cleaning and Standardization

Missing Data

Error Measurement

Software

Data Sharing

Documentation of Record Linkage Processes

Privacy Preserving Record Linkage

Biases in the Record Linkage Process

Findings

Conclusions


Journal: International Journal of Environmental Research and Public Health
Publication Date: Sep 1, 2020
Citations: 18
License type: CC BY 4.0
