Abstract

In recent years, rare pattern mining has proven valuable in real-world applications such as disease diagnosis, criminal behavior analysis, and network anomaly detection. When organizations publish or share data publicly, that data is at risk of leakage, because data mining techniques may uncover sensitive knowledge and information. To keep competitors from obtaining hidden information by processing the database, privacy-preserving data mining (PPDM) has been proposed and widely studied. However, most PPDM techniques target frequent pattern mining and cannot handle the privacy protection problems that arise in rare pattern mining, for example in network vulnerability detection and abnormal medical data. To address this limitation, this paper introduces a privacy-preserving technique for rare pattern mining. Two novel algorithms, Longest Transaction-Minimum Item Number (LT-MIN) and Longest Transaction-Maximum Item Number (LT-MAX), are proposed to hide sensitive rare itemsets and return the sanitized database. Both algorithms hide the target itemsets while minimizing side effects on the original database. In addition, they employ a projection mechanism to reduce the time spent scanning the database. Besides the traditional PPDM evaluation criteria, we propose two additional similarity measures that evaluate performance from the perspective of the itemsets and of the structural integrity of the database. The experimental results indicate that the proposed algorithms hide sensitive rare itemsets successfully and efficiently, and that the proposed measures can serve as evaluation criteria for privacy-preserving rare itemset mining (PPRIM).

