Top-k frequent itemsets via differentially private FP-trees

Jaewoo Lee,Christopher W Clifton

doi:10.1145/2623330.2623723

Abstract

Frequent itemset mining is a core data mining task and has been studied extensively. Although by their nature, frequent itemsets are aggregates over many individuals and would not seem to pose a privacy threat, an attacker with strong background information can learn private individual information from frequent itemsets. This has lead to differentially private frequent itemset mining, which protects privacy by giving inexact answers. We give an approach that first identifies top-k frequent itemsets, then uses them to construct a compact, differentially private FP-tree. Once the noisy FP-tree is built, the (privatized) support of all frequent itemsets can be derived from it without access to the original data. Experimental results show that the proposed algorithm gives substantially better results than prior approaches, especially for high levels of privacy.

Full Text