CciMST: A Clustering Algorithm Based on Minimum Spanning Tree and Cluster Centers

Xiaobo Lv, Yan Ma, Hui Huang, Xiaofu He, Jie Yang

doi:10.1155/2018/8451796

Abstract

The minimum spanning tree- (MST-) based clustering method can identify clusters of arbitrary shape by removing inconsistent edges. The definition of the inconsistent edges is a major issue that has to be addressed in all MST-based clustering algorithms. In this paper, we propose a novel MST-based clustering algorithm through the cluster center initialization algorithm, called cciMST. First, in order to capture the intrinsic structure of the data sets, we propose the cluster center initialization algorithm based on geodesic distance and dual densities of the points. Second, we propose and demonstrate that the inconsistent edge is located on the shortest path between the cluster centers, so we can find the inconsistent edge with the length of the edges as well as the densities of their endpoints on the shortest path. Correspondingly, we obtain two groups of clustering results. Third, we propose a novel intercluster separation by computing the distance between the points at the intersection of clusters. Furthermore, we propose a new internal clustering validation measure to select the best clustering result. The experimental results on the synthetic data sets, real data sets, and image data sets demonstrate the good performance of the proposed MST-based method.
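
The idea of searching for the inconsistent edge on the path between two cluster centers can be illustrated with a minimal sketch. This is not the cciMST implementation: it assumes the two centers are already known (the indices `center_a` and `center_b` are supplied by hand), it uses edge length only and ignores the density criterion, and all function names are hypothetical.

```python
import math
from collections import deque

def prim_mst(points):
    """Return the MST edges (weight, i, j) of the complete Euclidean graph."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known connection to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] != -1:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges

def build_adjacency(edges):
    adj = {}
    for edge in edges:
        _, i, j = edge
        adj.setdefault(i, []).append((j, edge))
        adj.setdefault(j, []).append((i, edge))
    return adj

def tree_path(edges, src, dst):
    """Return the edges on the unique tree path from src to dst."""
    adj = build_adjacency(edges)
    prev = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v, edge in adj.get(u, []):
            if v not in prev:
                prev[v] = (u, edge)
                queue.append(v)
    path, node = [], dst
    while prev[node] is not None:
        node, edge = prev[node]
        path.append(edge)
    return path

def split_between_centers(points, center_a, center_b):
    """Cut the longest MST edge on the path between two assumed centers."""
    edges = prim_mst(points)
    cut = max(tree_path(edges, center_a, center_b))   # compares by weight first
    adj = build_adjacency([e for e in edges if e != cut])
    # Points still reachable from center_a get label 0, the rest label 1.
    labels = [1] * len(points)
    labels[center_a] = 0
    queue = deque([center_a])
    while queue:
        u = queue.popleft()
        for v, _ in adj.get(u, []):
            if labels[v] == 1:
                labels[v] = 0
                queue.append(v)
    return labels, cut

points = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 0), (11, 0), (10, 1), (11, 1)]
labels, cut = split_between_centers(points, center_a=0, center_b=4)
print(labels, cut)
```

Restricting the search to the path between the centers means a long edge elsewhere in the tree, for example one leading to an outlier, is never a candidate for removal. The sketch handles two clusters only; the abstract's second criterion, the densities of the edge endpoints, would need a density estimate that is not specified here.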

Summary

Introduction

Clustering aims to group a set of objects into clusters such that objects in the same cluster are similar and objects in different clusters are dissimilar. The existing clustering methods, such as partitional, hierarchical, density-based, and grid-based approaches, are not completely satisfactory because of the multiplicity of problems and data distributions [2,3,4]. As a well-known partitional clustering algorithm, the K-means algorithm often assumes a spherical shape structure of the underlying data, so it cannot detect clusters with irregular boundaries. DBSCAN is a classical density-based clustering algorithm that can find clusters with arbitrary shapes. However, it requires two input parameters (the neighborhood radius and the minimum number of points), which are difficult to determine [4]. In MST-based clustering algorithms, by contrast, the shape of the cluster boundary has little impact on performance, which allows us to overcome the problems commonly faced by the classical clustering algorithms [6].

Journal: Mathematical Problems in Engineering
Publication Date: Dec 17, 2018
Citations: 16
License type: CC BY 4.0
