Abstract

DBSCAN is a fundamental density-based clustering algorithm with extensive applications. However, its O(n^2) worst-case time complexity is a bottleneck. In this paper, we propose an algorithm called GAP-DBC, which exploits the geometric relationships between points to address this problem. GAP-DBC first uses an efficient partitioning algorithm to partition the data set with a limited number of range queries, and then establishes an initial cluster structure based on that partition. It then iteratively refines the cluster structure with additional range queries. Finally, the cluster structure is completed by an iterative algorithm that uses the spatial relationships among points to avoid unnecessary distance calculations. We further show theoretically that GAP-DBC comes with a guarantee on its computational efficiency. We conducted experiments on both synthetic and real-world data sets to evaluate the performance of GAP-DBC. The results show that our algorithm is competitive with other state-of-the-art algorithms.
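To make the quadratic bottleneck concrete, the following is a minimal sketch of textbook DBSCAN with brute-force range queries. It is not GAP-DBC; it only illustrates the baseline cost that the abstract refers to. The function names (`range_query`, `dbscan`) and the parameters `eps` and `min_pts` are illustrative assumptions, not identifiers from the paper.

```python
import math

def range_query(points, i, eps):
    """Brute-force range query: indices of all points within eps of points[i].

    Each call computes n distances, which is the source of the O(n^2) cost.
    """
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Textbook DBSCAN (baseline, not GAP-DBC).

    Returns one label per point: cluster ids 0, 1, ... or -1 for noise.
    """
    UNVISITED, NOISE = -2, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] != UNVISITED:
            continue
        neighbors = range_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        frontier = [j for j in neighbors if j != i]
        while frontier:
            j = frontier.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins cluster, not expanded
                continue
            if labels[j] != UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = range_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # core point: keep expanding
                frontier.extend(j_neighbors)
    return labels

points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

In this sketch every point triggers at most one range query, and each query scans all n points, so the total is on the order of n^2 distance computations. Reducing the number of range queries and the distance calculations inside them is the kind of saving the abstract describes.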
