Abstract

Data mining has been a popular research area for more than a decade, and clustering is one of its most interesting problems. Clustering becomes more challenging when the data set is distributed among different parties that do not want to share their data. In this paper we propose a privacy-preserving two-party hierarchical clustering algorithm for vertically partitioned data sets. Each site learns only the final cluster centers and nothing about any individual's data.

Highlights

  • Data mining has been a popular research area for more than a decade because of its ability to efficiently extract statistics and trends from large sets of data

  • In this paper we propose a privacy-preserving two-party hierarchical clustering algorithm for vertically partitioned data sets

  • This paper gives a novel hierarchical clustering approach for vertically partitioned data sets that takes privacy into account


Summary

Introduction

Data mining has been a popular research area for more than a decade because of its ability to efficiently extract statistics and trends from large sets of data. When the data being mined is sensitive, privacy becomes a major concern, as any leakage or compromise of data may result in potential harm to individuals or financial losses to corporations. Clustering is widely used in applications such as finance, marketing, insurance, medicine, chemistry, machine learning, and data mining. Most solutions to the clustering problem for vertically partitioned data sets are based on the k-means algorithm. In this paper a secure hierarchical clustering approach over vertically partitioned data is provided. This hierarchical clustering is more efficient than the k-means clustering algorithm in identifying cluster centers.
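
To make "vertically partitioned" concrete, the following is a minimal sketch, not the secure protocol proposed in the paper. It assumes squared Euclidean distance and single-linkage merging, neither of which is confirmed by the text above, and all names and data are hypothetical. Two parties hold different attributes of the same records; because squared Euclidean distance is a sum over attributes, each party can compute its share of every pairwise distance locally. Here the shares are added in the clear, which is exactly the step a privacy-preserving protocol would replace with a secure computation.

```python
from itertools import combinations

def local_sq_dists(rows):
    """Pairwise squared Euclidean distances over one party's own columns."""
    n = len(rows)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        s = sum((a - b) ** 2 for a, b in zip(rows[i], rows[j]))
        d[i][j] = d[j][i] = s
    return d

def combine(d_a, d_b):
    """Full squared distance = sum of the per-party shares.
    Done in plaintext here; a secure protocol would hide the shares."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(d_a, d_b)]

def single_linkage(dist, k):
    """Naive agglomerative clustering: merge the two closest clusters
    until only k clusters remain."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for p, q in combinations(range(len(clusters)), 2):
            gap = min(dist[i][j] for i in clusters[p] for j in clusters[q])
            if best is None or gap < best[0]:
                best = (gap, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]  # safe because p < q
    return [sorted(c) for c in clusters]

# Hypothetical toy data: the same four records, split by attribute.
party_a = [(1.0,), (1.2,), (8.0,), (8.1,)]                    # attribute 1
party_b = [(2.0, 0.5), (2.1, 0.4), (9.0, 7.0), (9.2, 7.1)]    # attributes 2, 3

total = combine(local_sq_dists(party_a), local_sq_dists(party_b))
clusters = single_linkage(total, 2)
print(clusters)
```

On this toy input, records 0 and 1 end up in one cluster and records 2 and 3 in the other. The additive split of the distance is what makes the vertically partitioned setting tractable: neither party needs the other's raw attribute values, only a way to combine distance shares without revealing them.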

Distributed Data Mining
Cluster Analysis
Some Basic Privacy Preserving Techniques
Privacy Preserving Clustering Algorithm
Clustering on Vertically Partitioned Data Set
Secure Hierarchical Clustering on Vertically Partitioned Data Set
Closest-cluster
Efficiency and Privacy Analysis
Experimental Results
Conclusions and Future Research
