Fusion of Centroid-Based Clustering With Graph Clustering: An Expectation-Maximization-Based Hybrid Clustering.

Zekeriya Uykan

doi:10.1109/tnnls.2021.3121224

Abstract

This article extends the expectation-maximization (EM) formulation for the Gaussian mixture model (GMM) with a novel weighted dissimilarity loss. The extension fuses two different clustering methods, centroid-based clustering and graph clustering, in a single framework in order to leverage the advantages of both. The fusion yields a simple "soft" asynchronous hybrid clustering method. The proposed algorithm may start as a pure centroid-based clustering algorithm (e.g., k-means) and, as it converges, gradually turn into a pure graph clustering algorithm [e.g., basic greedy asynchronous distributed interference avoidance (GADIA) (Babadi and Tarokh, 2010)], or vice versa. The "hard" version of the proposed hybrid algorithm includes as special cases the standard Hopfield neural networks (and, thus, Bruck's Ln algorithm (Bruck, 1990) and the Ising model in statistical mechanics), Babadi and Tarokh's basic GADIA (2010), and the standard k-means (Steinhaus, 1956; MacQueen, 1967) [i.e., the Lloyd algorithm (Lloyd, 1957, 1982)]. We call the "hard" version of the proposed clustering "hybrid-nongreedy asynchronous clustering (H-NAC)." We apply H-NAC to various clustering problems using well-known benchmark datasets. The computer simulations confirm the superior performance of H-NAC compared to k-means clustering, k-GADIA, spectral clustering, and a very recent clustering algorithm, structured graph learning (SGL) by Kang et al. (2021), which represents one of the state-of-the-art clustering algorithms.
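The abstract does not state the update rule, so the following is only a minimal sketch of the general idea of a "hard" asynchronous hybrid assignment, not the paper's H-NAC. It assumes a per-point cost that mixes a squared distance to the cluster centroid with the summed dissimilarity to the other cluster members; the function name `hybrid_hard_clustering`, the mixing weight `alpha`, and the exact form of the cost are all hypothetical choices made for illustration.

```python
import numpy as np

def hybrid_hard_clustering(X, D, k, alpha=0.5, n_sweeps=50, init_labels=None, seed=0):
    """Illustrative asynchronous hard clustering (assumed cost, not the paper's).

    X: (n, d) data matrix. D: (n, n) symmetric dissimilarity matrix, zero diagonal.
    alpha=1 uses only the centroid term; alpha=0 uses only the graph term.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    if init_labels is None:
        labels = rng.integers(0, k, size=n)
    else:
        labels = np.array(init_labels, dtype=int)
    for _ in range(n_sweeps):
        changed = False
        # Asynchronous sweep: points are updated one at a time in random order.
        for i in rng.permutation(n):
            cost = np.zeros(k)
            for c in range(k):
                members = labels == c
                # Centroid term: squared distance to the current centroid of c.
                # An empty cluster costs 0, since x_i would be its own centroid.
                cent = np.sum((X[i] - X[members].mean(axis=0)) ** 2) if members.any() else 0.0
                # Graph term: total dissimilarity to the other members of c.
                members[i] = False
                graph = D[i, members].sum()
                cost[c] = alpha * cent + (1.0 - alpha) * graph
            best = int(np.argmin(cost))
            # Move only on strict improvement so that ties do not cause oscillation.
            if cost[best] < cost[labels[i]]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels

# Usage: four 1-D points, one of them initially assigned to the wrong group.
X = np.array([[0.0], [0.1], [10.0], [10.1]])
D = (X - X.T) ** 2  # squared Euclidean distance as the dissimilarity (an assumption)
labels = hybrid_hard_clustering(X, D, k=2, alpha=0.5, init_labels=[0, 0, 1, 0])
```

In this sketch, varying `alpha` between 1 and 0 moves the assignment rule from a purely centroid-based one toward a purely graph-based one, which is the kind of transition the abstract describes. How the paper actually weights and schedules the two terms is not given in this excerpt.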

Highlights

Clustering is a fundamental mechanism in data processing and machine learning applications, and it is a fundamental research area.
The extended k-Ln clustering algorithm turns out to be equivalent to the basic version of the pioneering greedy asynchronous distributed interference avoidance (GADIA) algorithm of Babadi and Tarokh [2].
We chose structured graph learning (SGL) [46] as a reference algorithm because it has shown superior performance compared to many state-of-the-art clustering methods, such as the accelerated low-rank representation (ALRR) published in 2018 [72], the K-multiple-means (KMM) in 2019 [73], the efficient sparse subspace clustering (ESSC) in 2020 [74], the fast normalized cut (FNC) in 2018 [75], and sparse subspace clustering by orthogonal matching pursuit (SSC-OMP), a popular sparse subspace clustering (SSC) algorithm [76].

Summary

Introduction

Clustering is a fundamental mechanism in data processing and machine learning applications, and it is a fundamental research area. It is the task of grouping a set of objects so that objects in the same group are more similar to each other than to those in other groups, and it helps in understanding and discovering the natural grouping in a dataset. No single clustering method solves all the different types of challenging real-life clustering problems with the best performance. Distribution-based clustering suffers from an overfitting problem, although it has a strong theoretical foundation. Another prominent method, the Gaussian mixture model (GMM), assumes Gaussian distributions, which is a rather strong assumption for various real-life datasets. For a given clustering problem and its datasets, we often end up determining the most appropriate clustering algorithm experimentally.

Journal: IEEE Transactions on Neural Networks and Learning Systems
Publication Date: Aug 1, 2023
Citations: 17
License type: CC BY 4.0

Similar Papers

Graph Clustering: Algorithms, Analysis and Query Design

01 Jan 2018

Solving Models in Statistical Mechanics
R.J Baxter
Integrable Systems in Quantum Field Theory and Statistical Mechanics | VOL. 19
01 Jan 1989

Improved spectral clustering using PCA based similarity measure on different Laplacian graphs
K R Kavitha ... P R Praveen
01 Dec 2016

Personalized PageRank Clustering: A graph clustering algorithm based on random walks
Shayan A Tabrizi ... Mohammad Ali Tavallaie
Physica A: Statistical Mechanics and its Applications | VOL. 392
24 Jul 2013
