Abstract

A central task in data mining is grouping data into meaningful clusters, for which many clustering algorithms exist. Among them, DBSCAN is one of the most effective and is used in a wide range of applications. It is regarded as a high-quality density-based method with several advantages: it identifies arbitrarily shaped clusters, it does not require the number of clusters to be specified in advance, and it detects outliers and sets them aside before clustering. Its two main input parameters, Epsilon (Eps) and the minimum number of points (MinPts), have a strong effect on clustering performance, which makes their choice critical. To address this problem, Eps and MinPts are selected automatically using the k-distance graph method, and the neighbourhood calculation for each data point is sped up using spatial access methods. The proposed algorithm combines a spatial access method with the k-distance graph method to improve scalability and reduce execution time. The experimental results show that combining the k-distance graph method with DBSCAN is efficient on all the metrics considered and can cluster both high- and low-dimensional data efficiently.
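
As an illustration of the k-distance graph idea, the following minimal sketch computes each point's distance to its k-th nearest neighbour, sorts those distances, and picks Eps at the "knee" of the resulting curve. It is not the authors' implementation: the function names `k_distances` and `knee_eps`, the chord-based knee heuristic, and the sample data are assumptions made for this example, and a brute-force neighbour search stands in for the spatial access method mentioned above.

```python
import math

def k_distances(points, k):
    """Distance from each point to its k-th nearest neighbour, sorted ascending."""
    result = []
    for i, p in enumerate(points):
        # Brute-force search; a spatial index would replace this loop in practice.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result)

def knee_eps(sorted_kdist):
    """Assumed knee heuristic: the value lying farthest below the straight
    line (chord) joining the first and last points of the sorted curve.
    Requires at least two values."""
    n = len(sorted_kdist)
    first, last = sorted_kdist[0], sorted_kdist[-1]
    best_idx, best_gap = 0, -1.0
    for i, d in enumerate(sorted_kdist):
        chord = first + (last - first) * i / (n - 1)
        gap = chord - d
        if gap > best_gap:
            best_idx, best_gap = i, gap
    return sorted_kdist[best_idx]

# Hypothetical data: two tight square clusters and one distant outlier.
demo_points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (50, 50),
]
k = 2                                  # MinPts = k + 1 when the point itself is counted
kdist = k_distances(demo_points, k)
eps = knee_eps(kdist)
```

With this data, the points inside the two clusters share the same small k-distance while the outlier's k-distance is far larger, so the knee falls at the cluster scale and the outlier ends up outside every Eps-neighbourhood.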
