Abstract
Privacy-preservation has emerged to be a major concern in devising a data mining system. But, protecting the privacy of data mining input does not guarantee a privacy-preserved output. This paper focuses on preserving the privacy of data mining output and particularly the output of classification task. Further, instead of static datasets, we consider the classification of continuously arriving data streams: a rapidly growing research area. Due to the challenges of data stream classification such as vast volume, a mixture of labeled and unlabeled instances throughout the stream and timely classifier publication, enforcing privacy-preservation techniques becomes even more challenging. In order to achieve this goal, we propose a systematic method for preserving output-privacy in data stream classification that addresses several applications like loan approval, credit card fraud detection, disease outbreak or biological attack detection. Specifically, we propose an algorithm named Diverse and k-Anonymized HOeffding Tree (DAHOT) that is an amalgamation of popular data stream classification algorithm Hoeffding tree and a variant of k-anonymity and l-diversity principles. The empirical results on real and synthetic data streams verify the effectiveness of DAHOT as compared to its bedrock Hoeffding tree and two other techniques, one that learns sanitized decision trees from sampled data stream and other technique that uses ensemble-based classification. DAHOT guarantees to preserve the private patterns while classifying the data streams accurately.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.