Abstract

Density-based clustering has been widely used in many fields. A new effective grid-based and density-based spatial clustering algorithm, GRIDEN, is proposed in this paper, which supports parallel computing in addition to multi-density clustering. It constructs grids using hyper-square cells and provides users with parameter k to control the balance between efficiency and accuracy to increase the flexibility of the algorithm. Compared with conventional density-based algorithms, it achieves much higher performance by eliminating distance calculations among points based on the newly proposed concept of ε-neighbor cells. Compared with conventional grid-based algorithms, it uses a set of symmetric (2k+1)D cells to identify dense cells and the density-connected relationships among cells. Therefore, the maximum calculated deviation of ε-neighbor points in the grid-based algorithm can be controlled to an acceptable level through parameter k. In our experiments, the results demonstrate that GRIDEN can achieve a reliable clustering result that is infinite closed with respect to the exact DBSCAN as parameter k grows, and it requires computational time that is only linear to N.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call