Evaluating hardening techniques against cryptanalysis attacks on Bloom filter encodings for record linkage

Thilina Ranbaduge, Anushka Vidanage, Peter Christen, Sirintra Vaiwsri, Rainer Schnell

doi:10.23889/ijpds.v3i4.725

Abstract

Introduction

Due to privacy concerns, personal identifiers used for linking data often have to be encoded (masked) before being linked across organisations. Bloom filter (BF) encoding is a popular privacy technique that is now employed in real-world linkage applications. Recent research has, however, shown that BFs are vulnerable to cryptanalysis attacks.

Objectives and Approach

Attacks on BFs either exploit the fact that encoding frequent plain-text values (such as common names) results in correspondingly frequent BFs, or they apply pattern mining to identify co-occurring BF bit positions that correspond to frequent encoded q-grams (sub-strings). In this study we empirically evaluated the privacy of individuals encoded in BFs against two recent cryptanalysis attack methods by Christen et al. (2017/2018). We used two snapshots of the North Carolina Voter Registration database for our evaluation, where pairs of records corresponding to the same voter (with name or address variations) resulted in files with 222,251 BFs and 224,061 plain-text records, respectively.

Results

We encoded between two and four of the fields first name, last name, street, and city into one BF per record. For combinations of three and four fields, all plain-text values and BFs were unique, challenging any frequency-based attack. To harden the BFs, we applied several suggested methods: balancing, random hashing, XOR folding, BLIP, and salting. Without any hardening, up to 20.7% and 5% of plain-text values were correctly re-identified as 1-to-1 matches by the pattern-mining and frequency-based attack methods, respectively. With balancing, random hashing, or XOR folding applied, the frequency-based attack achieved no more than 5% correct 1-to-1 re-identification matches on BFs encoding two fields, while the pattern-mining based attack achieved no correct re-identifications under any hardening technique.

Conclusion/Implications

Given that BF encoding is now being employed in real-world linkage applications, it is important to study the limits of this privacy technique. Our experimental evaluation shows that although basic BFs without any hardening technique are susceptible to cryptanalysis attacks, some hardening techniques are able to protect BFs against these attacks.
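To make the encoding and one of the hardening methods concrete, the following is a minimal Python sketch of q-gram based BF encoding and XOR folding. It is an illustration only and not the implementation used in the study: the filter length (`bf_len`), number of hash functions (`num_hash`), q-gram length (`q`), the double-hashing scheme built on SHA-1 and SHA-256, and all function names are assumptions, since the abstract does not state these details.

```python
import hashlib


def qgrams(value: str, q: int = 2) -> set[str]:
    """Split a string into its set of overlapping q-grams (sub-strings)."""
    value = value.lower()
    return {value[i:i + q] for i in range(len(value) - q + 1)}


def bf_encode(value: str, bf_len: int = 1000, num_hash: int = 10,
              q: int = 2) -> list[int]:
    """Encode the q-grams of one value into a Bloom filter (list of 0/1 bits).

    All parameter defaults are assumed for illustration.
    """
    bits = [0] * bf_len
    for gram in qgrams(value, q):
        # Double hashing (an assumed scheme): position_i = (h1 + i * h2) mod bf_len.
        h1 = int(hashlib.sha1(gram.encode()).hexdigest(), 16)
        h2 = int(hashlib.sha256(gram.encode()).hexdigest(), 16)
        for i in range(num_hash):
            bits[(h1 + i * h2) % bf_len] = 1
    return bits


def xor_fold(bits: list[int]) -> list[int]:
    """XOR folding: combine the first half of the filter with the second half."""
    half = len(bits) // 2
    return [a ^ b for a, b in zip(bits[:half], bits[half:])]


def dice(a: list[int], b: list[int]) -> float:
    """Dice coefficient of two bit lists, a common BF similarity measure."""
    common = sum(x & y for x, y in zip(a, b))
    return 2 * common / (sum(a) + sum(b))


smith = bf_encode("smith")
smyth = bf_encode("smyth")
folded = xor_fold(smith)
```

Because the encoding is deterministic, every record with the value "smith" yields the same BF, which is the property a frequency-based attack exploits. Similar values such as "smith" and "smyth" share q-grams and therefore share set bit positions, which is what makes approximate matching possible and what a pattern-mining attack looks for. XOR folding halves the filter length, so each bit of the folded filter no longer maps directly to a single original bit position.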

Journal: International Journal of Population Data Science
Publication Date: Aug 28, 2018
License type: CC BY-NC-ND 4.0

Similar Papers

Precise and Fast Cryptanalysis for Bloom Filter Based Privacy-Preserving Record Linkage
Peter Christen ... Rainer Schnell
IEEE Transactions on Knowledge and Data Engineering, vol. 31, 01 Nov 2019

Pattern-Mining Based Cryptanalysis of Bloom Filters for Privacy-Preserving Record Linkage
Peter Christen ... Anushka Vidanage
01 Jan 2018

Efficient Pattern Mining Based Cryptanalysis for Privacy-Preserving Record Linkage
Anushka Vidanage ... Thilina Ranbaduge
01 Apr 2019

Reference Values Based Hardening for Bloom Filters Based Privacy-Preserving Record Linkage
Sirintra Vaiwsri ... Peter Christen
01 Jan 2019
