Abstract

The clustering method divides a dataset into groups with similar data using similarity metrics. However, discovering clusters in different densities, shapes and distinct sizes is still a challenging task. In this regard, experts and researchers opt to use the DBSCAN algorithm as it uses density-based clustering techniques that define clusters of different sizes and shapes. However, it is misapplied to clusters of different densities due to its global attributes that generate a single density. Furthermore, most existing algorithms are unsupervised methods, where available prior knowledge is useless. To address these problems, this research suggests the use of a clustering algorithm that is semi-supervised. This allows the algorithm to use existing knowledge to generate pairwise constraints for clustering multi-density data. The proposed algorithm consists of two stages: first, it divides the dataset into different sets based on their density level and then applies the semi-supervised DBSCAN algorithm to each partition. Evaluation of the results shows the algorithm performing effectively and efficiently in comparison to unsupervised clustering algorithms.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call