Abstract

Most clustering algorithms organize a collection of objects into a set of disjoint clusters. Although this approach has been applied successfully in unsupervised learning, in several applications an object can belong to more than one cluster. Overlapping clustering is an alternative for such contexts, which include social network analysis, information retrieval, and bioinformatics, among other problems where non-disjoint clusters appear. In addition, in some environments the collection changes frequently and the clustering must be updated; however, most existing overlapping clustering algorithms cannot update the clustering efficiently. In this paper, we introduce a new overlapping clustering algorithm, called DClustR, which is based on a graph-theoretic approach and introduces a new strategy for building overlapping clusters that are more accurate than those built by state-of-the-art algorithms. Moreover, our algorithm introduces a new strategy for efficiently updating the clustering when the collection changes. Experiments conducted over several standard collections show that the proposed algorithm performs well with respect to both accuracy and efficiency.
