Differentially Private Frequent Sequence Mining via Sampling-based Candidate Pruning.

Shengzhi Xu, Zhengyi Li, Li Xiong, Sen Su, Xiang Cheng

doi:10.1109/icde.2015.7113354

Abstract

In this paper, we study the problem of mining frequent sequences under the rigorous differential privacy model. We explore the possibility of designing a differentially private frequent sequence mining (FSM) algorithm that achieves both high data utility and a high degree of privacy. We find that, in differentially private FSM, the amount of required noise is proportional to the number of candidate sequences. If the number of unpromising candidate sequences can be effectively reduced, the tradeoff between utility and privacy can be significantly improved. To this end, by leveraging a sampling-based candidate pruning technique, we propose a novel differentially private FSM algorithm, referred to as PFS2. The core of our algorithm is to use sample databases to further prune the candidate sequences generated based on the downward closure property. In particular, we use the noisy local support of candidate sequences in the sample databases to estimate which sequences are potentially frequent. To improve the accuracy of these private estimates, a sequence shrinking method is proposed to enforce the length constraint on the sample databases. Moreover, to decrease the probability of misestimating frequent sequences as infrequent, a threshold relaxation method is proposed to relax the user-specified threshold for the sample databases. Through formal privacy analysis, we show that our PFS2 algorithm is ε-differentially private. Extensive experiments on real datasets illustrate that our PFS2 algorithm can privately find frequent sequences with high accuracy.
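The pruning idea in the abstract can be illustrated with a minimal Python sketch. It is not the PFS2 algorithm itself: the abstract does not give the shrinking method, the relaxation rule, or the noise calibration, so everything below is an assumption made for illustration. The function names (`is_subsequence`, `laplace_noise`, `prune_candidates`), the `relax` factor, plain truncation as a stand-in for sequence shrinking, and the even split of ε across candidates (each support count has sensitivity 1, so sequential composition gives ε overall) are all hypothetical choices.

```python
import random


def is_subsequence(candidate, sequence):
    """Return True if `candidate` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in candidate)


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def prune_candidates(candidates, sample_db, threshold, epsilon,
                     max_len, relax=0.8, rng=None):
    """Keep candidates whose noisy local support meets a relaxed threshold.

    Assumptions (not taken from the paper):
      - `threshold` is a relative support in [0, 1];
      - truncation to `max_len` items stands in for sequence shrinking;
      - epsilon is split evenly over the candidates, so the Laplace scale
        grows linearly with the number of candidates.
    """
    rng = rng or random.Random()
    # Enforce the length constraint on every sampled sequence.
    shrunk = [seq[:max_len] for seq in sample_db]
    # One sequence changes each support count by at most 1.
    scale = len(candidates) / epsilon
    # Relaxed absolute threshold for the sample database.
    local_threshold = relax * threshold * len(shrunk)

    kept = []
    for cand in candidates:
        support = sum(is_subsequence(cand, seq) for seq in shrunk)
        if support + laplace_noise(scale, rng) >= local_threshold:
            kept.append(cand)
    return kept
```

Because the noise scale in this sketch is `len(candidates) / epsilon`, fewer candidates means less noise per count, which mirrors the abstract's observation about why pruning unpromising candidates helps.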

Journal: Proceedings. International Conference on Data Engineering
Publication Date: Apr 1, 2015
Citations: 43

Similar Papers

Differentially Private Frequent Sequence Mining.
Shengzhi Xu ... Xiang Cheng
IEEE Transactions on Knowledge and Data Engineering | VOL. 28
01 Nov 2016

Efficient algorithms for mining frequent high utility sequences with constraints
Tin Truong ... Hamido Fujita
Information Sciences | VOL. 568
02 Feb 2021

SPaMi-FTS: An Efficient Algorithm for Mining Frequent Sequential Patterns
José Kadir Febrer-Hernández ... Raudel Hernández-León
01 Jan 2014

Local differential privacy-based frequent sequence mining
Teng Wang ... Zhi Hu
Journal of King Saud University - Computer and Information Sciences | VOL. 34
25 Apr 2022
