Statistical Analysis of Microarray Data Clustering using NMF, Spectral Clustering, Kmeans, and GMM.

Andri Mirzal

doi:10.1109/tcbb.2020.3025486

Abstract

In the unsupervised learning literature, clustering of microarray gene expression datasets has been studied extensively, with nonnegative matrix factorization (NMF), spectral clustering, kmeans, and Gaussian mixture models (GMM) among the most widely used methods. However, only a limited number of works use statistical analysis to assess the significance of the performance differences between these methods. This paper presents a statistical analysis of the performance differences between ten NMF, six spectral clustering, four GMM, and the standard kmeans algorithms in clustering eleven publicly available microarray gene expression datasets, with the number of clusters ranging from two to ten. The experimental results show that, statistically, the NMF methods and kmeans perform similarly and both outperform spectral clustering. Because spectral clustering can uncover hidden manifold structures, the underperformance of the spectral methods raises the question of whether the datasets have manifold structures at all. Visual inspection using multidimensional scaling plots indicates that such structures do not exist. Moreover, because the plots indicate that clusters in some datasets have elliptical boundaries, GMM methods are also evaluated. The experimental results show that the GMM methods outperform the other methods to some degree, which implies that the datasets follow Gaussian distributions.
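
The abstract does not say which statistical test the paper uses, so the following is only a minimal sketch of one generic way to check whether two clustering methods differ significantly across a set of datasets: an exact paired sign-flip permutation test on per-dataset scores. The function name `paired_sign_flip_test` and the toy scores are hypothetical placeholders chosen for illustration; they are not results or procedures taken from the paper.

```python
from itertools import product


def paired_sign_flip_test(scores_a, scores_b):
    """Exact two-sided paired permutation (sign-flip) test.

    scores_a[i] and scores_b[i] are the scores of two methods on the
    same dataset i. Returns the p-value for the null hypothesis that
    the paired differences are symmetric around zero.
    (Hypothetical helper, not the procedure used in the paper.)
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both methods need one score per dataset")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    # Enumerate every assignment of signs to the paired differences.
    # With eleven datasets this is 2**11 = 2048 combinations.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        if stat >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    return extreme / total


# Toy placeholder scores on four imaginary datasets (not real results).
method_a = [0.75, 0.5, 1.0, 0.75]
method_b = [0.5, 0.25, 0.75, 0.5]
print(paired_sign_flip_test(method_a, method_b))  # 0.125
```

With only four paired scores the smallest achievable two-sided p-value is 2/16 = 0.125, which shows why a comparison over more datasets, such as the eleven used here, is needed before a difference can reach conventional significance levels.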

Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Publication Date: Sep 21, 2020
Citations: 29

Similar Papers

Image Retrieval Based on Multiview Constrained Nonnegative Matrix Factorization and Gaussian Mixture Model Spectral Clustering Method
Qunyi Xie ... Hongqing Zhu
Mathematical Problems in Engineering | VOL. 2016
01 Jan 2015

Statistical Analysis of Clustering Performances of NMF, Spectral Clustering, and K-means
Andri Mirzal
13 Oct 2020

NMF based gene selection algorithm for improving performance of the spectral cancer clustering
Andri Mirzal
01 Nov 2013

Advances in Nonnegative Matrix and Tensor Factorization
A Cichocki ... M Mørup
Computational Intelligence and Neuroscience | VOL. 2008
01 Jan 2008
