Abstract
Problem statement: The objective of the hybrid algorithm for privacy preserving data mining is to hide certain sensitive information so that it cannot be discovered through association rule mining techniques. Approach: The proposed algorithm combines the Increase Support of Left Hand Side (ISL) and Decrease Support of Right Hand Side (DSR) algorithms, i.e., it increases the support of the LHS item and decreases the support of the RHS item of a rule. As a result, sensitive items cannot be inferred through association rule mining algorithms, whether they appear in the Left Hand Side (LHS) or the Right Hand Side (RHS) of the rule. Results: The efficiency of the proposed approach is compared with that of the ISL approach alone on real databases, on the basis of the number of rules pruned. Conclusion: The hybrid approach of ISL and DSR algorithms prunes more sensitive rules with the same number of database scans.
Highlights
Privacy preserving data mining is a novel research direction in data mining and statistical databases where data mining algorithms are analyzed for the side-effects they incur in data privacy (Evfimievski et al., 2002).
To change the support of an item, we modify one item at a time in a selected transaction, changing it from 0 to 1 (which increases support) or from 1 to 0 (which decreases support). Based on these two strategies, we propose a data mining algorithm, called the hybrid algorithm, for hiding sensitive items in association rules. This algorithm first tries to hide the rules in which the item to be hidden, X, is on the right hand side, and then tries to hide the rules in which X is on the left hand side.
For this algorithm, t is a transaction, T is a set of transactions, U denotes a rule, RHS(U) is the Right Hand Side of rule U, LHS(U) is the Left Hand Side of rule U, and Confidence(U) is the confidence of rule U.
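The two-phase strategy described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's exact procedure: transactions are modelled as sets of item names, a rule counts as hidden once its confidence falls below a threshold `min_conf` (the minimum-support condition is ignored), and transactions are modified in list order. The function names (`count`, `confidence`, `dsr_step`, `isl_step`, `hide_rule`, `hybrid_hide`) are hypothetical and chosen only for this example.

```python
from typing import FrozenSet, List, Set, Tuple

Rule = Tuple[FrozenSet[str], FrozenSet[str]]  # (LHS(U), RHS(U))


def count(T: List[Set[str]], items: FrozenSet[str]) -> int:
    """Number of transactions t in T that contain every item in `items`."""
    return sum(1 for t in T if items <= t)


def confidence(T: List[Set[str]], lhs: FrozenSet[str], rhs: FrozenSet[str]) -> float:
    """Confidence(U) = count(LHS(U) and RHS(U) together) / count(LHS(U))."""
    n_lhs = count(T, lhs)
    return count(T, lhs | rhs) / n_lhs if n_lhs else 0.0


def dsr_step(T: List[Set[str]], lhs: FrozenSet[str], rhs: FrozenSet[str], x: str) -> bool:
    """DSR-style step: change x from 1 to 0 in one transaction supporting the whole rule."""
    for t in T:
        if (lhs | rhs) <= t:
            t.discard(x)
            return True
    return False


def isl_step(T: List[Set[str]], lhs: FrozenSet[str], rhs: FrozenSet[str], x: str) -> bool:
    """ISL-style step: change x from 0 to 1 in one transaction.

    Assumption: only pick a transaction that already holds the rest of the LHS
    and does not hold the full RHS, so LHS support rises while rule support
    stays the same.
    """
    for t in T:
        if x not in t and (lhs - {x}) <= t and not rhs <= t:
            t.add(x)
            return True
    return False


def hide_rule(T: List[Set[str]], rule: Rule, x: str, min_conf: float) -> bool:
    """Modify T one item at a time until Confidence(U) < min_conf, if possible."""
    lhs, rhs = rule
    step = dsr_step if x in rhs else isl_step
    while confidence(T, lhs, rhs) >= min_conf:
        if not step(T, lhs, rhs, x):
            return False  # no suitable transaction left to modify
    return True


def hybrid_hide(T: List[Set[str]], rules: List[Rule], x: str, min_conf: float) -> List[Rule]:
    """First handle rules with x on the RHS, then rules with x on the LHS."""
    rhs_rules = [u for u in rules if x in u[1]]
    lhs_rules = [u for u in rules if x in u[0]]
    return [u for u in rhs_rules + lhs_rules if hide_rule(T, u, x, min_conf)]
```

For example, calling `hybrid_hide(T, rules, "B", 0.5)` on a small transaction list modifies `T` in place and returns the rules whose confidence was pushed below 0.5. Counting transactions rather than computing fractional support keeps the confidence ratio exact for small examples.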
Summary
Privacy preserving data mining is a novel research direction in data mining and statistical databases where data mining algorithms are analyzed for the side-effects they incur in data privacy (Evfimievski et al., 2002). The first approach is to alter the data before delivery to the data miner so that real values are obscured. One technique of this approach is to selectively modify individual values from a database to prevent the discovery of a set of rules. Such methods apply a group of heuristic solutions for reducing the number of occurrences (support) of some frequent (large) item sets below a minimum user-specified threshold (Liu et al., 2008; Yang et al., 2005). The second type of privacy is that the data is manipulated so that the mining result is not affected or is only minimally affected.
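The idea of pushing the support of a frequent item set below a user-specified threshold can be illustrated with a short Python sketch. It is only an assumption-laden example, not one of the cited heuristics: the item to remove (`victim`) is picked arbitrarily, transactions are visited in list order, and the names `support_count` and `hide_itemset` are hypothetical.

```python
from typing import FrozenSet, List, Set


def support_count(transactions: List[Set[str]], itemset: FrozenSet[str]) -> int:
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def hide_itemset(transactions: List[Set[str]], itemset: FrozenSet[str], min_count: int) -> int:
    """Remove one item from supporting transactions until support < min_count.

    Returns the number of transactions that were modified.
    """
    victim = min(itemset)  # arbitrary choice; real heuristics select this carefully
    modified = 0
    for t in transactions:
        if support_count(transactions, itemset) < min_count:
            break  # the item set is no longer frequent
        if itemset <= t:
            t.discard(victim)
            modified += 1
    return modified
```

The function stops as soon as the item set drops below the threshold, so it changes only as many transactions as this simple ordering requires.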