Abstract

Determining the correct number of clusters (CNC) is an important task in data clustering and has a critical effect on finalizing the partitioning results. K-means is one of the most popular clustering methods and requires the CNC as an input. Validity index methods use an additional optimization procedure to estimate the CNC for K-means. We propose an alternative validity index approach, denoted K-MACE (K-Minimizing Average Central Error). The Average Central Error (ACE) is the average error between the unavailable true cluster center and the estimated cluster center for each data sample. Kernel K-MACE is kernel K-means equipped with the proposed CNC estimator; in addition, it includes a procedure that automatically tunes the Gaussian kernel parameters. Simulation results for both synthetic and real data show the superiority of K-MACE and kernel K-MACE over conventional clustering methods, not only in CNC estimation but also in the partitioning procedure.
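
As a concrete illustration of the ACE quantity defined above, the following minimal Python sketch computes it on synthetic data, where the true cluster centers are known (in real data they are unavailable, which is precisely what K-MACE must estimate around). The helper name `average_central_error` and all settings here are illustrative assumptions, not the authors' implementation.

```python
# Sketch of the ACE idea: average error between each sample's (normally
# unavailable) true cluster center and the estimated center of the cluster
# that sample is assigned to. Computable here only because the data is synthetic.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
true_centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
labels_true = rng.integers(0, 3, size=300)
X = true_centers[labels_true] + rng.normal(scale=0.8, size=(300, 2))

def average_central_error(X, labels_true, true_centers, k):
    """Mean squared error between the true center and the estimated center
    assigned to each sample, for a k-cluster K-means partition."""
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    est_center_per_sample = km.cluster_centers_[km.labels_]
    true_center_per_sample = true_centers[labels_true]
    return np.mean(np.sum((est_center_per_sample - true_center_per_sample) ** 2, axis=1))

for k in range(2, 7):
    print(k, average_central_error(X, labels_true, true_centers, k))
# The error is smallest near the correct number of clusters (k = 3 here).
```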

Highlights

  • Clustering is one of the most widely used unsupervised learning tasks, in which unlabeled observed data samples are grouped based on their similarities and dissimilarities

  • The kernel parameter governs the separability of clusters in feature space, and its optimum value corresponds to a correct estimate of the number of clusters (CNC). We propose another important feature of the kernel K-MACE (K-Minimizing ACE) clustering algorithm: it automatically tunes the Gaussian kernel parameters to their optimum values (a generic illustration follows this list)

  • We compare K-MACE and kernel K-MACE with two divisive hierarchical clustering methods that are partitioning schemes: G-means [16], which is proposed mainly for Gaussian clusters, and Dip-means [17], a more recent approach
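
The paper's automatic tuning procedure is not reproduced here. As a hedged, generic illustration of why the Gaussian kernel width must be selected rather than fixed by hand, the sketch below grid-searches the RBF parameter `gamma` and scores each resulting partition with a silhouette index. SpectralClustering with an RBF affinity merely stands in for kernel K-means (scikit-learn ships no kernel K-means), so this is an assumption-laden sketch, not kernel K-MACE.

```python
# Generic grid search over the Gaussian (RBF) kernel width: different widths
# yield very different partitions in feature space, so the parameter must be
# validated rather than chosen by trial and error.
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=0)

best = None
for gamma in [0.01, 0.05, 0.1, 0.5, 1.0, 5.0]:          # candidate kernel widths
    labels = SpectralClustering(n_clusters=3, affinity="rbf",
                                gamma=gamma, random_state=0).fit_predict(X)
    score = silhouette_score(X, labels)                  # quality of the partition
    if best is None or score > best[0]:
        best = (score, gamma)

print("selected gamma:", best[1])
```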

Summary

INTRODUCTION

Clustering is one of the most widely used unsupervised learning tasks, in which unlabeled observed data samples are grouped based on their similarities and dissimilarities. While cluster assignment in K-means is based on the distance of a sample to its cluster center, another family of clustering algorithms is density based, where clusters are formed by grouping samples according to their proximity to their neighboring samples; these methods provide the CNC estimate simultaneously. The existing validity index methods used with K-means employ the available cluster compactness to estimate the CNC by optimizing a criterion that is not similar to the K-means partitioning criterion. Note that, in addition to the number of clusters m, existing kernel-based clustering methods require tuning the kernel function parameters. This is currently done by trial and error, and no method for validating and choosing the optimum parameters is available.
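
To make the contrast concrete, here is a minimal sketch of the conventional validity-index workflow described above: K-means is run for each candidate k, and the CNC estimate is the k that optimizes an external index (silhouette here) rather than the K-means objective itself. The choice of index and all settings are illustrative assumptions, not a specific method from the paper.

```python
# Conventional validity-index approach: scan candidate k, score each K-means
# partition with an external criterion, and report the best-scoring k as the CNC.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=400, centers=4, cluster_std=1.0, random_state=1)

scores = {}
for k in range(2, 9):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)   # external index, not the K-means SSE

cnc_estimate = max(scores, key=scores.get)
print("estimated CNC:", cnc_estimate)
```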

PROBLEM STATEMENT
M-CLUSTERING NOTATIONS
ACE IN M-CLUSTERING
MEAN AND VARIANCE OF ACE
ACE MEAN ESTIMATION
BOUNDS ON ACE
VALIDATION AND CONFIDENCE PROBABILITIES
K-MACE USING THE AVAILABLE DATA
INITIAL CLUSTER ASSIGNMENT IN KERNEL K-MACE
K-MACE IN FEATURE SPACE
OPTIMUM GAUSSIAN KERNEL PARAMETER
SIMULATIONS AND RESULTS
REAL DATA
CONCLUSION