A graph traversal attack on Bloom filter-based medical data aggregation

Ramakrishna Thurimella, Max Roschke, William Mitchell, Rinku Dewri

doi:10.1504/ijbdi.2017.10006842

Abstract

We present a novel cryptanalytic method based on graph traversals to show that record linkage using Bloom filter encoding, often suggested as a practical approach to medical data aggregation, does not preserve privacy in a two-party setting. The attack is stronger than a simple dictionary attack because it does not assume knowledge of the universe of possible values. It is practical and produced accurate results when tested on large amounts of name-like data derived from a North Carolina voter registration database. We also give theoretical arguments showing that going from bigrams to n-grams with n > 2 does not increase privacy; on the contrary, it makes the attack more effective. Finally, we suggest some ways to resist this attack.
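For readers unfamiliar with the encoding under attack, the following is a minimal sketch of bigram-based Bloom filter encoding as it is commonly described for record linkage. It is not the authors' implementation. The underscore padding, the use of seeded SHA-256 as the hash family, the filter length `m`, the number of hash functions `k`, and the function names are all assumptions chosen for illustration.

```python
import hashlib


def bigrams(name: str) -> list[str]:
    """Split a name into overlapping 2-character substrings.

    Underscore padding at both ends is an assumption of this sketch.
    """
    padded = f"_{name.lower()}_"
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


def encode(name: str, m: int = 64, k: int = 2) -> set[int]:
    """Return the set of bit positions set in an m-bit Bloom filter.

    Each bigram is hashed k times; seeded SHA-256 stands in for the
    hash family (hypothetical choice, not taken from the paper).
    """
    bits = set()
    for gram in bigrams(name):
        for seed in range(k):
            digest = hashlib.sha256(f"{seed}:{gram}".encode()).digest()
            bits.add(int.from_bytes(digest[:8], "big") % m)
    return bits


def dice(a: set[int], b: set[int]) -> float:
    """Dice coefficient of two filters, a common linkage similarity score."""
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    print(bigrams("ann"))
    print(sorted(encode("ann")))
    print(dice(encode("smith"), encode("smyth")))
```

Because each bigram always sets the same bit positions, filters for names that share bigrams necessarily share bits. In this sketch, every bit set for "an" is also set for "ann", since the bigrams of "an" are a subset of those of "ann". This kind of structural overlap between filters is what makes the encoding a target for cryptanalysis.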
