Abstract

Problem statement: The objective of the hybrid algorithm for privacy preserving data mining is to hide sensitive information so that it cannot be discovered through association rule mining techniques. Approach: The algorithm combines the Increase Support of Left Hand Side (ISL) and Decrease Support of Right Hand Side (DSR) algorithms, increasing the support of the LHS item and decreasing the support of the RHS item of a rule, so that sensitive items cannot be inferred by association rule mining algorithms whether they appear in the Left Hand Side (LHS) or the Right Hand Side (RHS) of the rule. Results: The efficiency of the proposed approach is compared with the ISL approach alone on real databases, on the basis of the number of rules pruned. Conclusion: The hybrid approach of ISL and DSR prunes more sensitive rules with the same number of database scans.

Highlights

  • Privacy preserving data mining is a novel research direction in data mining and statistical databases, in which data mining algorithms are analyzed for the side effects they have on data privacy (Evfimievski et al., 2002)

  • To change the support of an item, we modify one item at a time in a selected transaction, changing it from 1 to 0 (which decreases support) or from 0 to 1 (which increases support). Based on these two strategies, we propose a data-mining algorithm for hiding sensitive items in association rules, called the hybrid algorithm. This algorithm first tries to hide the rules in which the item to be hidden, X, is in the right hand side, and then tries to hide the rules in which X is in the left hand side

  • For this algorithm, t is a transaction, T is a set of transactions, U is a rule, RHS(U) is the Right Hand Side of rule U, LHS(U) is the Left Hand Side of rule U, and Confidence(U) is the confidence of rule U
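
The hybrid procedure described in the highlights above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`support`, `confidence`, `is_hidden`, `hide_item`), the thresholds `min_supp` and `min_conf`, the representation of each transaction t as a set of items, and the order in which transactions are visited are all hypothetical. The check that skips a transaction when adding X would complete a sensitive rule is an added safeguard that the source does not describe.

```python
from typing import FrozenSet, List, Set, Tuple

Rule = Tuple[FrozenSet[str], FrozenSet[str]]  # (LHS(U), RHS(U))


def support(transactions: List[Set[str]], itemset: FrozenSet[str]) -> float:
    """Fraction of transactions in T (assumed non-empty) containing `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions: List[Set[str]], rule: Rule) -> float:
    """Confidence(U) = count(LHS and RHS together) / count(LHS)."""
    lhs, rhs = rule
    lhs_count = sum(lhs <= t for t in transactions)
    both_count = sum((lhs | rhs) <= t for t in transactions)
    return both_count / lhs_count if lhs_count else 0.0


def is_hidden(transactions: List[Set[str]], rule: Rule,
              min_supp: float, min_conf: float) -> bool:
    """A rule counts as hidden once it falls below either threshold."""
    lhs, rhs = rule
    return (support(transactions, lhs | rhs) < min_supp
            or confidence(transactions, rule) < min_conf)


def hide_item(transactions: List[Set[str]], rules: List[Rule], x: str,
              min_supp: float, min_conf: float) -> List[Rule]:
    """Modify `transactions` in place; return the rules involving x that are hidden."""
    # Phase 1 (DSR-style): x is on the right hand side.
    # Removing x (a 1 -> 0 flip) from a transaction that supports the whole
    # rule lowers the rule's support and confidence.
    for lhs, rhs in rules:
        if x not in rhs:
            continue
        for t in transactions:
            if is_hidden(transactions, (lhs, rhs), min_supp, min_conf):
                break
            if (lhs | rhs) <= t:
                t.discard(x)

    # Phase 2 (ISL-style): x is on the left hand side.
    # Adding x (a 0 -> 1 flip) raises the support of the LHS, which lowers
    # the confidence as long as the rule's own support does not rise.
    for lhs, rhs in rules:
        if x not in lhs:
            continue
        for t in transactions:
            if is_hidden(transactions, (lhs, rhs), min_supp, min_conf):
                break
            if x in t or not (lhs - {x}) <= t:
                continue
            # Assumed safeguard (not from the source): skip t if adding x
            # would make it fully support any sensitive rule.
            if any((l | r) <= (t | {x}) for l, r in rules):
                continue
            t.add(x)

    return [r for r in rules
            if x in (r[0] | r[1]) and is_hidden(transactions, r, min_supp, min_conf)]
```

The DSR-style phase runs before the ISL-style phase, matching the order stated in the highlights, and each phase stops modifying transactions as soon as the rule it is working on drops below either threshold.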


Summary

Introduction

Privacy preserving data mining is a novel research direction in data mining and statistical databases, in which data mining algorithms are analyzed for the side effects they have on data privacy (Evfimievski et al., 2002). The first approach is to alter the data before delivery to the data miner so that real values are obscured. One technique of this approach is to selectively modify individual values in a database to prevent the discovery of a set of rules. Such techniques apply a group of heuristic solutions that reduce the number of occurrences (support) of some frequent (large) item sets below a minimum user-specified threshold (Liu et al., 2008; Yang et al., 2005). The second approach is to manipulate the data so that the mining result is not affected or is only minimally affected
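
Reducing the support of a frequent item set below a threshold, as described above, can be illustrated with a short Python sketch. It is only an assumption-laden example, not a method taken from the cited works: the names `support_count` and `hide_itemset`, the absolute threshold `min_count`, and the choice of which item to remove (here simply the smallest item in sort order) are hypothetical. Real heuristics choose the item and the transactions so as to limit side effects on other item sets.

```python
from typing import List, Set


def support_count(transactions: List[Set[str]], itemset: Set[str]) -> int:
    """Number of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions)


def hide_itemset(transactions: List[Set[str]], itemset: Set[str],
                 min_count: int) -> int:
    """Remove one item of `itemset` from supporting transactions, in place,
    until the item set occurs fewer than `min_count` times.

    Returns the number of transactions that were modified.
    """
    victim = min(itemset)  # assumption: arbitrary but deterministic choice
    modified = 0
    for t in transactions:
        if support_count(transactions, itemset) < min_count:
            break  # the item set is no longer frequent
        if itemset <= t:
            t.discard(victim)
            modified += 1
    return modified
```

Each removal lowers the support count of the item set by exactly one, so the loop touches only as many transactions as are needed to cross the threshold.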

