A Rapid Hybrid Clustering Algorithm for Large Volumes of High Dimensional Data

Punit Rathore, Dheeraj Kumar, Marimuthu Palaniswami, Sutharshan Rajasegarar, James C Bezdek

doi:10.1109/tkde.2018.2842191

Abstract

Clustering large volumes of high-dimensional data is a challenging task. Many clustering algorithms have been developed to handle datasets with either a very large sample size or a very high number of dimensions, but they are often impractical when the data is large in both respects. To overcome both the ‘curse of dimensionality’ caused by high dimensionality and the scalability problems caused by large sample size, we propose a new fast clustering algorithm called FensiVAT. FensiVAT is a hybrid, ensemble-based clustering algorithm that uses fast data-space reduction and an intelligent sampling strategy. In addition to clustering, FensiVAT provides visual evidence that can be used to estimate the number of clusters in the data (cluster tendency assessment). In our experiments, we compare FensiVAT with nine state-of-the-art approaches that are popular for clustering large-sample or high-dimensional data. Experimental results suggest that FensiVAT, which can cluster large volumes of high-dimensional data in a few seconds, is the fastest and most accurate of the methods tested.
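
The abstract names "fast data-space reduction" and "an intelligent sampling strategy" without giving details. As a rough illustration of what such a pipeline could look like, the sketch below assumes Gaussian random projection for the reduction step and greedy maximin (farthest-point) selection for the sampling step, followed by nearest-prototype labelling of the remaining points. These are stand-ins chosen here for illustration, and the function names are hypothetical; none of this is taken from the paper.

```python
import numpy as np


def random_projection(X, target_dim, rng):
    """Project the rows of X to target_dim dimensions.

    Assumption: a plain Gaussian random matrix, scaled so that squared
    distances are preserved in expectation.
    """
    R = rng.normal(size=(X.shape[1], target_dim)) / np.sqrt(target_dim)
    return X @ R


def maximin_sample(Y, n_samples, rng):
    """Greedily pick n_samples row indices of Y that are far apart.

    Starts from a random row, then repeatedly adds the row whose distance
    to its nearest already-chosen row is largest.
    """
    first = int(rng.integers(Y.shape[0]))
    chosen = [first]
    min_dist = np.linalg.norm(Y - Y[first], axis=1)
    for _ in range(n_samples - 1):
        nxt = int(np.argmax(min_dist))
        chosen.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(Y - Y[nxt], axis=1))
    return np.array(chosen)


def nearest_prototype_labels(Y, prototype_idx):
    """Label every row of Y with the position of its nearest sampled row."""
    dists = np.stack(
        [np.linalg.norm(Y - Y[i], axis=1) for i in prototype_idx], axis=1
    )
    return np.argmin(dists, axis=1)


# Toy data: two well-separated blobs in 200 dimensions (synthetic, for illustration only).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(300, 200)),
    rng.normal(50.0, 1.0, size=(300, 200)),
])

Y = random_projection(X, target_dim=10, rng=rng)   # 200 -> 10 dimensions
samples = maximin_sample(Y, n_samples=2, rng=rng)  # two far-apart representatives
labels = nearest_prototype_labels(Y, samples)      # extend labels to all points
print(np.bincount(labels))
```

In this toy setting the sampled points act directly as cluster prototypes. A fuller pipeline would draw a larger sample, cluster it, and only then extend the labels to the rest of the data.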

Journal: IEEE Transactions on Knowledge and Data Engineering
Publication Date: Aug 1, 2018
Citations: 106

Similar Papers

Sketched Subspace Clustering
Panagiotis A Traganitis ... Georgios B Giannakis
IEEE Transactions on Signal Processing | VOL. 66
07 Feb 2018

M-Denclue for Effective Data Clustering in High Dimensional Non-Linear Data
International Journal of Innovative Technology and Exploring Engineering | VOL. 9
10 Nov 2019

Clustering High-Dimensional Stock Data using Data Mining Approach
Dhea Indriyanti ... Arian Dhini
01 Jul 2019

Clustering of High Dimensional Handwritten Data by an Improved Hypergraph Partition Method
Tian Wang ... Yonggang Lu
01 Jan 2017
