Abstract

BIRCH algorithm, introduced by Zhang et al. [15], is a well known algorithm for effectively finding clusters in a large data set. The two major components of the BIRCH algorithm are CF tree construction and global clustering. However BIRCH algorithm is basically designed as an algorithm working on a single database. We propose the first novel method for running BIRCH over a vertically partitioned data sets, distributed in two different databases in a privacy preserving manner. We first provide efficient solutions to crypto primitives such as finding minimum index in a vector sum and checking if sum of two private values exceed certain threshold limit. We then use these primitives as basic tools to arrive at secure solutions to CF tree construction and single link clustering for implementing BIRCH algorithm.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call