Abstract

The core objective of privacy-preserving data mining is to preserve the confidentiality of individuals even after mining. The main advantage of personalized privacy preservation is that it incurs much less information loss than other privacy preservation algorithms. These algorithms, however, have not been designed for specific mining algorithms. SW-SDF personalized privacy preservation uses two flags, SW and SDF. SW assigns a weight to the sensitive attribute, and SDF marks sensitive disclosure, a value accepted from the individual. In this paper we design an algorithm that uses SW-SDF personalized privacy preservation for data classification. This method ensures both privacy and classification of the data.

Highlights

  • Privacy preserving data mining (PPDM) is a novel approach in data mining which preserves the privacy of individual- or company-related information even after the mining process [1,8,9].

  • In k-anonymity [4], each record must be indistinguishable from at least k-1 other records. This suffers from the homogeneity attack, in which every record in a block has the same sensitive value. l-diversity [5] requires that each block contain at least l well-represented sensitive values.

  • The frequency distribution of the sensitive attribute is shown in Table 3.


Summary

INTRODUCTION

Privacy preserving data mining (PPDM) is a novel approach in data mining which preserves the privacy of individual- or company-related information even after the mining process [1,8,9]. In k-anonymity [4], each record must be indistinguishable from at least k-1 other records. This suffers from the homogeneity attack, in which every record in a block has the same sensitive value. l-diversity [5] requires that each block contain at least l well-represented sensitive values. The drawback of this approach is that it suffers from the similarity attack. For example, consider patient information with the attributes pid, pname, page, pzipcode and pdisease. Here pid and pname are identifiers and are removed by the data publisher before the data is given to the data recipient. The classification may include identifying the values of page and pzipcode on which a pdisease value such as "heart disease" depends. This classification must not reveal the identity of the individual or company-related information. The resultant published data can still be used for constructing a data classifier approximately equivalent to the original classifier using SW-SDF personal privacy preservation.
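The k-anonymity and l-diversity conditions described above can be illustrated with a minimal Python sketch. It is not the paper's SW-SDF algorithm. It assumes that page and pzipcode act as the attributes that define a block, that pdisease is the sensitive attribute, and that "well represented" is read in its simplest form as "distinct". The function names and the sample records are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def blocks(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[q] for q in quasi_identifiers)
        groups[key].append(record)
    return groups


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every block holds at least k records."""
    return all(len(g) >= k for g in blocks(records, quasi_identifiers).values())


def is_distinct_l_diverse(records, quasi_identifiers, sensitive, l):
    """True if every block holds at least l distinct sensitive values.

    Assumption: 'well represented' is simplified to 'distinct'.
    """
    return all(
        len({r[sensitive] for r in g}) >= l
        for g in blocks(records, quasi_identifiers).values()
    )


# Hypothetical, already-generalized patient records (pid and pname removed).
patients = [
    {"page": "30-39", "pzipcode": "560*", "pdisease": "heart disease"},
    {"page": "30-39", "pzipcode": "560*", "pdisease": "heart disease"},
    {"page": "40-49", "pzipcode": "561*", "pdisease": "flu"},
    {"page": "40-49", "pzipcode": "561*", "pdisease": "cancer"},
]

qi = ["page", "pzipcode"]
k_ok = is_k_anonymous(patients, qi, 2)
l_ok = is_distinct_l_diverse(patients, qi, "pdisease", 2)
print(k_ok, l_ok)
```

In this sample the table satisfies 2-anonymity but not 2-diversity, because the first block contains only "heart disease". That is the homogeneity attack: anyone known to be in that block is exposed even though each record has a twin.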

Related work in privacy preservation
Data classification using ID3
Existing system
Proposed system
Notation for SW-SDF data converter
Experimental results
Conclusion and future work