Abstract

While the sharing of data is known to be beneficial in data mining applications and widely acknowledged as advantageous in business, this information sharing can become controversial and thwarted by privacy regulations and other privacy concerns. Data clustering for instance could be more accurate if more information is available, hence the data sharing. Any solution needs to balance the clustering requirements and the privacy issues. Rather than simply hindering data owners from sharing information for data analysis, a solution could be designed to meet privacy requirements and guarantee valid data clustering results. To achieve this dual goal, this article introduces a method for privacy-preserving clustering called dimensionality reduction-based transformation (DRBT). This method relies on the intuition behind random projection to protect the underlying attribute values subjected to cluster analysis. It is shown analytically and empirically that transforming a dataset using DRBT, a data owner can achieve privacy preservation and get accurate clustering with little overhead of communication cost. Such a method presents the following advantages: it is independent of distance-based clustering algorithms, it has a sound mathematical foundation, and it does not require CPU-intensive operations.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.