Abstract

BackgroundCombining multiple databases with disjunctive or additional information on the same person is occurring increasingly throughout research. If unique identification numbers for these individuals are not available, probabilistic record linkage is used for the identification of matching record pairs. In many applications, identifiers have to be encrypted due to privacy concerns.MethodsA new protocol for privacy-preserving record linkage with encrypted identifiers allowing for errors in identifiers has been developed. The protocol is based on Bloom filters on q-grams of identifiers.ResultsTests on simulated and actual databases yield linkage results comparable to non-encrypted identifiers and superior to results from phonetic encodings.ConclusionWe proposed a protocol for privacy-preserving record linkage with encrypted identifiers allowing for errors in identifiers. Since the protocol can be easily enhanced and has a low computational burden, the protocol might be useful for many applications requiring privacy-preserving record linkage.

Highlights

  • Medical databases of people usually contain identifiers like surnames, given names, date of birth, and address information

  • There are some intriguing approaches proposed in the literature, these have a number of problems, for instance they involve very high computing demands or high rates of false positives or false negatives

  • We suggest the use of Bloom filters for solving this problem

Read more

Summary

Introduction

Since the identifiers agree exactly if their corresponding hash values agree, the third party can link matching records without knowing the identifiers. Variants of this protocol using exact matching have been published [12,13]. Combining multiple databases with disjunctive or additional information on the same person is occurring increasingly throughout medical research. The availability of large medical databases and unique person identifier (ID) numbers has made widespread use of record linkage possible. In many research applications not all databases contain a unique ID number In such situations, probabilistic record linkage is most frequently applied for the identification of matching record pairs [1]. We developed a new procedure which addresses these problems

Objectives
Methods
Results
Discussion
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.