IDCUP Algorithm to Classifying Arbitrary Shapes and Densities for Center-based Clustering Performance Analysis

Saud Altaf, Muhammad Waseem Waseem, Laila Kazmi

doi:10.28945/4541

Abstract

Aim/Purpose: Clustering techniques are used to identify significant and meaningful subclasses in datasets. Clustering is an unsupervised type of Machine Learning (ML) whose objective is to group objects by similarity and to uncover implicit relationships between the features of the data. Cluster analysis is a significant problem area in data exploration when datasets contain clusters of arbitrary shape. Clustering large data sets poses the following challenges: (1) clusters with arbitrary shapes; (2) a limited knowledge discovery process for deciding the possible input features; (3) scalability to large data sizes. Density-based clustering is known as a dominant method for detecting arbitrary-shape clusters.

Background: Existing density-based clustering methods commonly cited in the literature have been examined in terms of their behavior on data sets that contain nested clusters of varying density. These methods are not well suited to such data sets because they typically partition the data into clusters that cannot be nested.

Methodology: A density-based approach to traditional center-based clustering is introduced that assigns a weight to each cluster. The weights are then used when calculating the distance from each data vector to each centroid: the distance is multiplied by the centroid's weight.

Contribution: This paper examines different density-based clustering methods on data sets with nested clusters of varying density. Two such data sets were used to evaluate some of the commonly cited algorithms found in the literature. Nested clusters were found to be challenging for the existing algorithms. In most cases, the targeted algorithms either did not detect the largest clusters or simply divided large clusters into non-overlapping regions. However, it may be possible to detect all clusters by running the algorithm multiple times with different inputs and then combining the results. This work considered three challenges of clustering methods.

Findings: A centroid with a low weight attracts objects from further away than a centroid with a higher weight. This allows dense clusters inside larger clusters to be recognized. The methods are tested experimentally using the K-means, DBSCAN, TURN*, and IDCUP algorithms. The experimental results on different data sets showed that IDCUP is more robust and produces better clusters than DBSCAN, TURN*, and K-means. Finally, we compare K-means, DBSCAN, TURN*, and IDCUP in dealing with arbitrary-shape problems on different datasets. IDCUP shows better scalability than TURN*.

Future Research: Future work will explore further challenges of the knowledge discovery process in clustering, along with more complex data sets, given more time. A hybrid approach based on density-based and model-based clustering algorithms needs to be compared in order to achieve maximum performance accuracy and to avoid arbitrary-shape related problems, including optimization. It is anticipated that the suggested future process will attain improved performance with comparable precision in identifying cluster shapes.
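
The weighting described under Methodology and Findings can be illustrated with a small sketch. It is not the IDCUP implementation: the abstract does not say how the weights are derived, so the function name `assign_weighted`, the sample points, and the weight values below are all hypothetical. The sketch only shows the assignment step, in which each distance is multiplied by the centroid's weight before the nearest centroid is chosen.

```python
import math


def assign_weighted(points, centroids, weights):
    """Assign each point to the centroid with the smallest weighted distance.

    The weighted distance is weight * Euclidean distance, following the
    abstract's description. How IDCUP chooses the weights is not stated
    there, so the caller supplies them (an assumption of this sketch).
    """
    labels = []
    for p in points:
        best_index, best_score = -1, math.inf
        for i, (c, w) in enumerate(zip(centroids, weights)):
            score = w * math.dist(p, c)  # distance scaled by centroid weight
            if score < best_score:
                best_index, best_score = i, score
        labels.append(best_index)
    return labels


# Hypothetical data: two centroids on a line and three points between them.
centroids = [(0.0, 0.0), (4.0, 0.0)]
points = [(1.0, 0.0), (2.5, 0.0), (3.5, 0.0)]

# Equal weights reduce to ordinary nearest-centroid assignment.
plain = assign_weighted(points, centroids, [1.0, 1.0])

# Lowering the first centroid's weight lets it claim the middle point,
# which is farther from it in plain Euclidean terms.
weighted = assign_weighted(points, centroids, [0.5, 1.0])

print(plain)     # [0, 1, 1]
print(weighted)  # [0, 0, 1]
```

With equal weights the middle point at (2.5, 0) goes to the second centroid. Halving the first centroid's weight moves that point to the first centroid, which matches the finding that a low-weight centroid attracts objects from further away.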

Journal: Interdisciplinary Journal of Information, Knowledge, and Management
Publication Date: Jan 1, 2020
Citations: 1
License type: CC BY-NC 4.0

Similar Papers

Knn density-based clustering for high dimensional multispectral images
T.N Tran ... R Wehrens
22 May 2003

A Density-Based Spatial Flow Cluster Detection Method
Ran Tao ... Jean-Claude Thill
International Conference on GIScience Short Paper Proceedings | VOL. 1
01 Jan 2015

Landmark FN-DBSCAN: An Efficient Density-Based Clustering Algorithm with Fuzzy Neighborhood
Hao Liu ... Masahito Kurihara
Journal of Advanced Computational Intelligence and Intelligent Informatics | VOL. 17
20 Jan 2013

Clustering Approach on Core-based and Energy-based Vibrating
Shardrom Johnson ... Wu Zhang
01 Jan 2004
