Abstract

DBSCAN is a base algorithm for density-based clustering. It can find out the clusters of different shapes and sizes from a large amount of data, which is containing noise and outliers. However, it is fail to handle the local density variation that exists within the cluster. Thus, a good clustering method should allow a significant density variation within the cluster because, if we go for homogeneous clustering, a large number of smaller unimportant clusters may be generated. In this paper, an enhancement of DBSCAN algorithm is proposed, which detects the clusters of different shapes and sizes that differ in local density. Our proposed method VMDBSCAN first finds out the “core” of each cluster—clusters generated after applying DBSCAN. Then, it “vibrates” points toward the cluster that has the maximum influence on these points. Therefore, our proposed method can find the correct number of clusters.

Highlights

  • Unsupervised clustering techniques are an important data analysis task that tries to organize the data set into separated groups with respect to a distance or, equivalently, a similarity measure [1]

  • EDBSCAN [26] algorithm is another extension of DBSCAN; it keeps tracks of density variation which exists within the cluster

  • Failing to detect the density varied clusters, there are many researches existing as an enhancement of DBSCAN for handling the density variation within the cluster

Read more

Summary

Introduction

Unsupervised clustering techniques are an important data analysis task that tries to organize the data set into separated groups with respect to a distance or, equivalently, a similarity measure [1]. Hierarchal algorithms use distance measurements between the objects and between the clusters. Grid-based algorithms quantize the object space into a finite number of cells (hyper-rectangles) and perform the required operations on the quantized space The advantage of this approach is the fast processing time that is in general independent of the number of data objects. Model-based algorithms find good approximations of model parameters that best fit the data They can be either partitional or hierarchical, depending on the structure or model they hypothesize about the data set and the way they refine this model to identify partitionings. They are closer to density-based algorithms in that they grow particular clusters so that the preconceived model is improved.

Related Work
DBSCAN Algorithm
The Proposed Algorithm
Simulation and Results
Conclusions and Future Works
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call