Measuring side effects of rule hiding

R S Jinturkar, S Kolkur

doi:10.1145/1980022.1980128

Abstract

Data mining techniques have been widely used in various applications. However, the misuse of these techniques may lead to the disclosure of sensitive information. Large repositories of data contain sensitive information that must be protected against unauthorized access. Protecting the confidentiality of this information has been a long-term goal of the database security research community and of government statistical agencies. Recent advances in data mining and machine learning algorithms have increased the disclosure risks that one may encounter when releasing data to outside parties. In this paper, we present an approach that modifies a few transactions in a transaction database to decrease the support or confidence of sensitive rules. Since the correlation among rules can make it impossible to achieve this goal, we suggest a method that hides the sensitive rules with a reduced number of modified entries. The method presented in this paper hides all the selected sensitive rules, limits the side effects, i.e. lost rules (rules from the original database that can no longer be mined) and spurious rules (new rules generated by the modification), and measures the count of each of these constraints for different sets of minimum support threshold (MST) and minimum confidence threshold (MCT) values.
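The side effects named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the toy database, the brute-force rule miner, the two candidate sanitizations, and all function names (count, mine_rules, remove_items, side_effects) are assumptions made for illustration only. The sketch mines rules before and after some items are removed, then reports hiding failures, lost rules, spurious rules, and the number of modified entries for one MST/MCT pair.

```python
from itertools import combinations

def count(db, itemset):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in db if itemset <= t)

def mine_rules(db, mst, mct):
    """Brute-force miner: all rules X -> Y with support >= mst, confidence >= mct."""
    items = sorted(set().union(*db))
    rules = set()
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            z = frozenset(combo)
            n_z = count(db, z)
            if n_z == 0 or n_z / len(db) < mst:
                continue  # itemset is not frequent enough
            for k in range(1, size):
                for lhs in combinations(combo, k):
                    x = frozenset(lhs)
                    if n_z / count(db, x) >= mct:  # confidence of X -> Z \ X
                        rules.add((x, z - x))
    return rules

def remove_items(db, edits):
    """Copy of db with items removed; edits maps transaction index -> items."""
    return [t - frozenset(edits.get(i, ())) for i, t in enumerate(db)]

def side_effects(original, sanitized, sensitive, mst, mct):
    """Compare the rule sets mined before and after sanitization."""
    before = mine_rules(original, mst, mct)
    after = mine_rules(sanitized, mst, mct)
    return {
        "hiding_failures": sensitive & after,        # sensitive rules still minable
        "lost_rules": (before - sensitive) - after,  # non-sensitive rules that vanished
        "spurious_rules": after - before,            # rules that did not exist before
        "modified_entries": sum(len(o ^ s) for o, s in zip(original, sanitized)),
    }

# Hypothetical toy data: six transactions over items A, B, C.
MST, MCT = 0.5, 0.7
original = [frozenset(t) for t in ("ABC", "ABC", "ABC", "AB", "AB", "C")]
sensitive = {(frozenset("A"), frozenset("B"))}  # the rule A -> B must be hidden

# Two hypothetical sanitizations, each deleting item B from two transactions.
plan_a = remove_items(original, {3: "B", 4: "B"})
plan_b = remove_items(original, {0: "B", 3: "B"})

for name, plan in (("plan_a", plan_a), ("plan_b", plan_b)):
    result = side_effects(original, plan, sensitive, MST, MCT)
    print(name, {k: v if isinstance(v, int) else len(v) for k, v in result.items()})
```

In this toy case both plans hide A -> B by modifying two entries, but the side effects differ: plan_a loses no rules and creates three spurious ones, while plan_b creates no spurious rules and loses four. Repeating the comparison over several MST and MCT values would give the kind of per-threshold counts the abstract refers to.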
