Abstract

Advances in data processing and the growth of the internet have resulted in a data explosion. The data now available may contain sensitive information that could, if misused, jeopardise the privacy of individuals. On today’s web, the privacy of personal and business information is a growing concern for individuals, corporate entities and governments. Protecting personal and sensitive information is essential to the success of today’s data mining techniques, and it is even more crucial in critical sectors such as defence, health care and finance. Privacy Preserving Data Mining (PPDM) addresses these issues by balancing the preservation of privacy against the utilisation of data. Traditionally, Geometrical Data Transformation Methods (GDTMs) have been widely used for privacy-preserving clustering. The drawback of these methods is that geometric transformation functions are invertible, which results in a lower level of privacy protection. In this work, a Principal Component Analysis (PCA)-based technique is proposed that preserves the privacy of sensitive information in a multi-party clustering scenario. The technique is evaluated by applying a classical K-means clustering algorithm, as well as a machine learning-based clustering method, to synthetic and real-world datasets. Clustering accuracy is computed before and after the privacy-preserving transformation. The proposed PCA-based transformation method provided superior privacy protection and better performance than the traditional GDTMs.
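The invertibility drawback noted for GDTMs can be illustrated with a minimal sketch. It assumes a plain 2-D rotation as the geometric transformation and uses made-up records; the `rotate` helper, the sample values and the 30-degree angle are hypothetical and are not taken from the paper. It shows only that anyone who learns the rotation angle can undo the transformation; it does not implement the proposed PCA-based method.

```python
import math

def rotate(points, theta):
    """Rotate 2-D points counter-clockwise by theta radians about the origin."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

# Hypothetical records (two numeric attributes each); not from the paper.
original = [(25.0, 40.0), (37.0, 82.5), (52.0, 61.0)]
theta = math.radians(30)  # assumed secret rotation angle

released = rotate(original, theta)    # perturbed data that would be shared
recovered = rotate(released, -theta)  # inverse rotation applied by an attacker

# Largest per-coordinate difference between original and recovered records.
max_error = max(
    abs(a - b)
    for p, q in zip(original, recovered)
    for a, b in zip(p, q)
)
print(f"largest reconstruction error: {max_error:.2e}")
```

The recovered records match the originals up to floating-point rounding, which is why a transformation of this kind offers limited protection once its parameters leak.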
