Abstract

The Black Hole Clustering (BHC) algorithm is a density-based partitional clustering method inspired by Density-Based Spatial Clustering of Applications with Noise (DBSCAN). It requires neither the number of clusters nor the computation of the pairwise distance matrix between the data points, which makes it faster than DBSCAN. It also needs only one parameter, which is intuitively easier to set than the epsilon parameter of DBSCAN. However, BHC relies on the allocation of so-called black holes, which have to be linearly independent, so the current version of the algorithm is suitable only for two- or three-dimensional data sets. In this paper, we propose a generalized version of the black hole clustering algorithm (GBHC) by introducing a novel black hole allocation procedure for higher-dimensional data spaces. Furthermore, the proposed procedure is data-independent, so it has to be run only once to obtain the black hole positions for all finite-dimensional metric spaces. We performed extensive computational experiments to compare GBHC with DBSCAN. The results show that both algorithms obtain comparable clustering solutions. GBHC, however, outperforms DBSCAN in computational complexity and explainability.

