Abstract

We present two new clustering algorithms, k-sets and k-swaps, for data in which each object is a set. First, we define the mean of the sets in a cluster and the distance between a set and that mean. We then derive the k-sets algorithm from the principles of classical k-means, so that it repeats the assignment and update steps until convergence. To the best of our knowledge, the proposed algorithm is the first k-means-based algorithm for this kind of data. We also apply the idea to the random swap algorithm, a wrapper around k-means that avoids local minima; this variant is called k-swaps. We show by experiments that this algorithm provides more accurate clustering results than k-medoids and other competing methods.
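
The assignment and update loop described above can be illustrated with a minimal sketch. The abstract does not give the definitions of the cluster mean or of the set-to-mean distance, so the code below uses stand-ins that are assumptions, not the paper's method: Jaccard distance, and a "majority" representative made of the items that occur in at least half of a cluster's sets. All function names are hypothetical.

```python
from collections import Counter


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def majority_representative(members):
    """Stand-in for a cluster 'mean' (an assumption, not the paper's definition):
    the items that appear in at least half of the member sets."""
    counts = Counter(item for s in members for item in s)
    return {item for item, c in counts.items() if 2 * c >= len(members)}


def cluster_sets(data, initial_representatives, max_iter=100):
    """k-means-style loop over sets: assign, update, repeat until labels stop changing."""
    reps = [set(r) for r in initial_representatives]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each set goes to its nearest representative.
        new_labels = [
            min(range(len(reps)), key=lambda j: jaccard_distance(s, reps[j]))
            for s in data
        ]
        if new_labels == labels:
            break  # converged: no set changed cluster
        labels = new_labels
        # Update step: recompute the representative of every non-empty cluster.
        for j in range(len(reps)):
            members = [s for s, lab in zip(data, labels) if lab == j]
            if members:
                reps[j] = majority_representative(members)
    return labels, reps


data = [{1, 2, 3}, {1, 2}, {2, 3}, {7, 8}, {7, 8, 9}, {8, 9}]
labels, reps = cluster_sets(data, initial_representatives=[data[0], data[3]])
```

In this toy run the initial representatives are chosen by hand. A random swap wrapper in the spirit of k-swaps would instead repeatedly replace one representative with a randomly chosen data set, rerun the loop, and keep the change only if the total distance decreases.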
