Privacy-oriented discovery of interesting pattern from numeric attributes

Z Chen

doi:10.1109/icsmc.2003.1244228

Abstract

The use and dissemination of sensitive information is one of the major concerns surrounding knowledge discovery. Existing mining algorithms use discretization to partition each numeric attribute into a set of intervals during the data preprocessing phase. However, this method not only produces many irrelevant and uninteresting patterns but also discloses information. In this paper, we propose a new framework to address this issue. The new approach first perturbs and transforms the original data set according to a set of different belief levels, without information loss. The transformed data are then sent to the data mining consultancy, where rules are generated under the different belief levels. An interestingness filter is then used to eliminate redundant rules. The rules are useful only in the context of the partition performed by the data provider, and no information is disclosed. The proposed technique has been applied to a number of sensitive real-life data sets. Experimental results show that it is very effective, especially when the data set contains many numeric attributes.
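To make the discretization step mentioned above concrete, the following is a minimal sketch of partitioning one numeric attribute into intervals. It assumes equal-width binning, which the abstract does not specify, and it does not reproduce the paper's perturbation or belief-level transformation. The function names `equal_width_intervals` and `discretize` and the sample `ages` column are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple


def equal_width_intervals(values: List[float], k: int) -> List[Tuple[float, float]]:
    """Split the range of `values` into k equal-width intervals (assumed scheme)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [(lo + i * width, lo + (i + 1) * width) for i in range(k)]


def discretize(values: List[float], intervals: List[Tuple[float, float]]) -> List[int]:
    """Map each value to the index of the interval that contains it.

    Intervals are half-open [a, b), except the last one, which also
    includes its upper bound so that the maximum value gets a label.
    """
    labels = []
    for v in values:
        for idx, (a, b) in enumerate(intervals):
            is_last = idx == len(intervals) - 1
            if a <= v < b or (is_last and v <= b):
                labels.append(idx)
                break
    return labels


# Hypothetical numeric attribute held by the data provider.
ages = [20, 30, 40, 50, 60]
intervals = equal_width_intervals(ages, 4)
labels = discretize(ages, intervals)
```

In this sketch only the interval indices in `labels` would leave the data provider, while the interval boundaries stay private. This mirrors the abstract's remark that rules are meaningful only in the context of the provider's partition.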
