A new data clustering algorithm based on critical distance methodology

Farag Hamed Kuwil, Fadi Shaar, Ahmet Ercan Topcu, Fionn Murtagh

doi:10.1016/j.eswa.2019.03.051

Abstract

A variety of algorithms have recently emerged in the field of cluster analysis, so an appropriate algorithm can in principle be chosen according to how the data are distributed. In practice, however, it is difficult for a user to decide a priori which algorithm is most appropriate for a given dataset. Graph-based algorithms provide good results for this task, but they are vulnerable to outliers and have only limited information about the edges of the tree with which to split a dataset. The need for better clustering algorithms is therefore growing in several fields, and robust, dynamic algorithms that improve and simplify the whole clustering process are urgently needed. In this paper, we propose a novel distance-based clustering algorithm called the critical distance clustering algorithm. It relies on the Euclidean distance between data points and on some basic statistical operations. The algorithm is simple, robust, and flexible; it works with real-valued quantitative data of different dimensions, not with qualitative or categorical data. In this work, 26 experiments are conducted on different types of real and synthetic datasets taken from different fields. The results show that the new algorithm outperforms some popular clustering algorithms such as MST-based clustering, K-means, and DBSCAN. Moreover, the algorithm produces more reasonable clusters even when the dataset contains outliers, without requiring any parameters to be specified in advance. It also provides a number of indicators for evaluating the resulting clusters and checking the validity of the clustering.
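
The abstract gives no formulas, so the following is only a minimal sketch of the general idea it describes: clustering from Euclidean distances and basic statistics, with no user-supplied parameters. The function name `threshold_clusters` and the threshold rule (mean plus one standard deviation of the nearest-neighbour distances) are assumptions made for illustration and should not be read as the paper's definition of the critical distance.

```python
import math
from statistics import mean, pstdev


def threshold_clusters(points):
    """Group points that are linked by distances below a data-derived threshold.

    Illustrative only: the threshold rule is an assumption, not the
    critical distance defined in the paper.
    """
    n = len(points)
    if n < 2:
        return [0] * n

    # Pairwise Euclidean distances between all points.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Distance from each point to its nearest neighbour.
    nn = [min(d for j, d in enumerate(row) if j != i)
          for i, row in enumerate(dist)]

    # Hypothetical threshold derived from basic statistics of the data.
    threshold = mean(nn) + pstdev(nn)

    # Label the connected components of the graph that links every pair
    # of points whose distance does not exceed the threshold.
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and dist[i][j] <= threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels = threshold_clusters(points)
```

On this toy input the two well-separated groups of three points receive different labels. The sketch takes only the data as input, which mirrors the parameter-free claim in the abstract, but how the published algorithm derives its threshold and handles outliers is not stated here.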

Journal: Expert Systems with Applications	Publication Date: Mar 29, 2019
Citations: 32

Similar Papers

Clustering Algorithm of Density Difference Optimized by Mixed Teaching and Learning
Hailong Chen ... Miaomiao Ge
01 May 2020
SN Computer Science | VOL. 1

An Effective Analysis of Data Clustering using Distance-based K- Means Algorithm
P Ramkumar ... M Sheela Devi
01 Aug 2021
Journal of Physics: Conference Series | VOL. 1979

A Novel Adaptive Kernel Picture Fuzzy C-Means Clustering Algorithm Based on Grey Wolf Optimizer Algorithm
Can-Ming Yang ... Sheng Duan
13 Jul 2022
Symmetry | VOL. 14

Minimum spanning tree‐based cluster analysis: A new algorithm for determining inconsistent edges
Fadi Şaar ... Ahmet E Topcu
18 Nov 2021
Concurrency and Computation: Practice and Experience | VOL. 34
