A privacy protection technique for publishing data mining models and research data

Yu Fu, Aryya Gangopadhyay, Gunes Koru, Zhiyuan Chen

doi:10.1145/1877725.1877732

Abstract

Data mining techniques have been widely used in many research disciplines such as medicine, life sciences, and social sciences to extract useful knowledge (such as mining models) from research data. Research data often needs to be published along with the data mining model for verification or reanalysis. However, the privacy of the published data needs to be protected because otherwise the published data is subject to misuse such as linking attacks. Therefore, employing various privacy protection methods becomes necessary. However, these methods only consider privacy protection and do not guarantee that the same mining models can be built from sanitized data. Thus the published models cannot be verified using the sanitized data. This article proposes a technique that not only protects privacy, but also guarantees that the same model, in the form of decision trees or regression trees, can be built from the sanitized data. We have also experimentally shown that other mining techniques can be used to reanalyze the sanitized data. This technique can be used to promote sharing of research data.

Journal: ACM Transactions on Management Information Systems
Publication Date: Dec 1, 2010
Citations: 17

Similar Papers

Domain-Oriented Data-Driven Data Mining (3DM): Simulation of Human Knowledge Understanding
Guoyin Wang
15 Dec 2006

Spatial mapping of the provenance of storm dust: Application of data mining and ensemble modelling
Hamid Gholami ... Adrian L Collins
Atmospheric Research | VOL. 233
24 Oct 2019

3DM: Domain-oriented Data-driven Data Mining
Guoyin Wang ... Yan Wang
Fundamenta Informaticae | VOL. 90
01 Jan 2009

Application of Neural Network Method to Determine Public Satisfaction Level on Pertalite Fuel
Fitri Rahmadani ... Irmayanti Irmayanti
sinkron | VOL. 8
04 Aug 2024
