Abstract
The core objective of privacy-preserving data mining is to preserve the confidentiality of individuals even after mining. The main advantage of personalized privacy preservation is that it loses much less information than other privacy preservation algorithms. These algorithms, however, have not been designed for specific mining algorithms. SW-SDF personalized privacy preservation uses two flags, SW and SDF: SW assigns a weight to the sensitive attribute, and SDF records the sensitive disclosure choice accepted from the individual. In this paper we design an algorithm that uses SW-SDF personalized privacy preservation for data classification. The method preserves privacy while still allowing the data to be classified.
Highlights
Privacy-preserving data mining (PPDM) is a novel approach in data mining that preserves the privacy of information about individuals or companies even after the mining process [1,8,9].
In k-anonymity [4], each record must be indistinguishable from at least k-1 other records. This suffers from the homogeneity attack, in which every record in a block has the same sensitive value. l-diversity [5] requires that each block of indistinguishable records contain at least l well-represented sensitive values.
The frequency distribution of the sensitive attribute is shown in Table 3.
Summary
Privacy-preserving data mining (PPDM) is a novel approach in data mining that preserves the privacy of information about individuals or companies even after the mining process [1,8,9]. In k-anonymity [4], each record must be indistinguishable from at least k-1 other records. This suffers from the homogeneity attack, in which every record in a block has the same sensitive value. l-diversity [5] requires that each block of indistinguishable records contain at least l well-represented sensitive values. The drawback of this approach is that it suffers from the similarity attack. For example, consider patient information with the attributes pid, pname, page, pzipcode and pdisease. Here pid and pname are called identifiers and are removed by the data publisher before the data is given to the data recipient. Classification may include identifying the values of page and pzipcode on which a pdisease value such as "heart disease" depends. This classification must not reveal the identity of the individual or company-related information. With SW-SDF personal privacy preservation, the resulting published data can still be used to construct a classifier approximately equivalent to the original classifier.
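The k-anonymity and l-diversity conditions described above can be checked mechanically on a published table. The following is a minimal sketch, not the SW-SDF algorithm of this paper: it assumes that page and pzipcode act as the quasi-identifiers and pdisease as the sensitive attribute, it uses the simplest "distinct values" reading of l-diversity, and the generalized records and function names are hypothetical, made up only for illustration.

```python
from collections import defaultdict


def group_by_quasi_identifiers(records, quasi_identifiers):
    """Group records into blocks that share the same quasi-identifier values."""
    blocks = defaultdict(list)
    for record in records:
        key = tuple(record[attr] for attr in quasi_identifiers)
        blocks[key].append(record)
    return blocks


def k_anonymity_level(records, quasi_identifiers):
    """Largest k for which the table is k-anonymous (size of the smallest block)."""
    blocks = group_by_quasi_identifiers(records, quasi_identifiers)
    return min(len(block) for block in blocks.values())


def distinct_l_diversity_level(records, quasi_identifiers, sensitive):
    """Largest l for which every block holds at least l distinct sensitive values."""
    blocks = group_by_quasi_identifiers(records, quasi_identifiers)
    return min(
        len({record[sensitive] for record in block})
        for block in blocks.values()
    )


# Hypothetical published table: pid and pname are already removed,
# page and pzipcode are generalized (assumed values, not from the paper).
published = [
    {"page": "20-29", "pzipcode": "560**", "pdisease": "heart disease"},
    {"page": "20-29", "pzipcode": "560**", "pdisease": "heart disease"},
    {"page": "30-39", "pzipcode": "561**", "pdisease": "flu"},
    {"page": "30-39", "pzipcode": "561**", "pdisease": "cancer"},
]

quasi_identifiers = ["page", "pzipcode"]
k = k_anonymity_level(published, quasi_identifiers)
l = distinct_l_diversity_level(published, quasi_identifiers, "pdisease")
print(f"k = {k}, l = {l}")
```

In this hypothetical table every block has two records, so the table is 2-anonymous, yet the first block contains only "heart disease". Its diversity level is therefore 1, which is exactly the situation the homogeneity attack exploits.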