Abstract

This study compares three benchmark clustering methods (mini batch k-means, DBSCAN, and spectral clustering) with regular decomposition (RD), a new method developed for large graph data. RD is first adapted so that it is applicable to numerical data without graph structure, by changing its input into a distance matrix and its output into cluster labels. The results indicate that mini batch k-means has the best overall performance in terms of accuracy, time, and space consumption. RD and spectral clustering achieve a competitive adjusted Rand index (ARI), but their time and space consumption is considerable, reaching up to 2 and 30 times that of mini batch k-means on the artificial datasets. DBSCAN, on the other hand, produces an ARI as low as 0% in most cases under default parameters, but its ARI rises to as much as 100% in almost all experiments on the artificial datasets after the parameters are varied. Even so, DBSCAN’s accuracy, time, and space consumption remain worse than those of mini batch k-means.
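
The abstract mentions two building blocks: a distance matrix as input and the adjusted Rand index (ARI) as the accuracy measure. The following is a minimal standard-library sketch of both, not the study's implementation. The function names `distance_matrix` and `adjusted_rand_index` are hypothetical, and the choice of Euclidean distance is an assumption, since the abstract does not say which distance is used.

```python
from collections import Counter
from math import comb, dist


def distance_matrix(points):
    """Pairwise distances between rows of numerical data.

    Assumption: Euclidean distance; the abstract does not name the metric.
    """
    return [[dist(p, q) for q in points] for p in points]


def adjusted_rand_index(labels_true, labels_pred):
    """ARI between two labellings of the same points.

    Returns 1.0 when the partitions are identical up to label renaming
    and about 0.0 when they agree no more than expected by chance.
    """
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two points: partitions trivially agree

    # Count point pairs that share a cluster in both, in truth, in prediction.
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = sum_rows * sum_cols / total_pairs
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (sum_cells - expected) / (max_index - expected)
```

As a usage note, `adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])` evaluates to 1.0 because ARI ignores label names, while a labelling that puts every point into a single cluster, such as `[0, 0, 0, 0]`, scores 0.0 against the same ground truth. ARI values are often reported as percentages, as in the abstract.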
