Nuclear Norm Clustering: a promising alternative method for clustering tasks

Yi Wang, Yin Yao Shugart, Yi Li, Momiao Xiong, Chunhong Qiao, Meng Hao, Li Jin, Xiaoyu Liu

doi:10.1038/s41598-018-29246-4

Abstract

Clustering techniques are widely used in many applications. The goal of clustering is to identify patterns or groups of similar objects within a dataset of interest. However, many clustering methods are not robust to the noise and outliers found in real data. In this paper, we present Nuclear Norm Clustering (NNC, available at https://sourceforge.net/projects/nnc/), an algorithm that can be used in various fields as a promising alternative to the k-means clustering method. The NNC algorithm requires users to provide a data matrix M and a desired number of clusters K. We employ simulated annealing to choose an optimal label vector that minimizes the nuclear norm of the pooled within-cluster residual matrix. To evaluate the NNC algorithm, we compared it with other classic methods on 15 public datasets and 2 genome-wide association study (GWAS) datasets on psoriasis. The results indicate that NNC performs competitively in terms of F-score on the 15 benchmark public datasets and the 2 psoriasis GWAS datasets. NNC is therefore a promising alternative method for clustering tasks.
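The objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (that is distributed at the SourceForge link above); it only assumes that the "pooled within-cluster residual matrix" is the data matrix M with each row's cluster mean subtracted, and that simulated annealing proposes single-row relabelings. The function names, the geometric cooling schedule, and the parameters `n_iter`, `t0`, `cooling`, and `seed` are all hypothetical choices made for illustration.

```python
import numpy as np


def pooled_residual_nuclear_norm(M, labels, K):
    """Nuclear norm (sum of singular values) of M after subtracting,
    from every row, the mean of the cluster that row is assigned to."""
    R = np.asarray(M, dtype=float).copy()
    for k in range(K):
        mask = labels == k
        if mask.any():  # empty clusters contribute nothing
            R[mask] -= R[mask].mean(axis=0)
    return np.linalg.norm(R, ord="nuc")


def nnc_sketch(M, K, n_iter=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated-annealing search over label vectors (illustrative only).

    The proposal move (relabel one random row) and the cooling schedule
    are assumptions, not details taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n = M.shape[0]
    labels = rng.integers(0, K, size=n)
    cost = pooled_residual_nuclear_norm(M, labels, K)
    best_labels, best_cost = labels.copy(), cost
    t = t0
    for _ in range(n_iter):
        i = rng.integers(n)
        old = labels[i]
        labels[i] = rng.integers(K)  # propose a new label for row i
        new_cost = pooled_residual_nuclear_norm(M, labels, K)
        delta = new_cost - cost
        # Accept improvements always, worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < np.exp(-delta / t):
            cost = new_cost
            if cost < best_cost:
                best_labels, best_cost = labels.copy(), cost
        else:
            labels[i] = old  # reject: restore the previous label
        t *= cooling  # cool down
    return best_labels, best_cost
```

Calling `nnc_sketch(M, K=2)` on an n-by-p NumPy array returns the best label vector found and its objective value. Recomputing a full singular value decomposition at every step is simple but slow, so this form is only practical for small matrices.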

Highlights

Clustering is defined as grouping objects in sets
We developed the Nuclear Norm Clustering (NNC) method, a highly accurate and robust algorithm used for clustering analysis
We observed that the datasets in which NNC performed better were linearly separable

Summary

Introduction

Clustering is defined as grouping objects in sets. A good clustering method generates clusters with high intra-class similarity and low inter-class similarity[1]. Partitioning Around Medoids (PAM) is a clustering algorithm related to k-means clustering and the medoids shift algorithm[4]. Both k-means and PAM are partitional (they break the dataset up into groups), and both attempt to minimize the distance between the points assigned to a cluster and a point designated as the center of that cluster. In contrast to k-means, PAM chooses actual data points as centers and works with a generalization of the Manhattan norm to define the distance between data points. PAM was proposed in 1987 and is a classical partitioning technique that clusters a dataset of n objects into k clusters.
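The difference between the two kinds of center can be made concrete with a small sketch. It assumes the plain Manhattan (L1) distance rather than the generalization mentioned above, and the helper names `kmeans_center` and `pam_medoid` are hypothetical; only the center computation for a single cluster is shown, not either full algorithm.

```python
import numpy as np


def kmeans_center(points):
    """k-means style center: the coordinate-wise mean, which need not
    coincide with any observed data point."""
    return points.mean(axis=0)


def pam_medoid(points):
    """PAM style center: the data point whose total Manhattan distance
    to all other points in the cluster is smallest."""
    # Pairwise L1 distances, shape (n, n).
    d = np.abs(points[:, None, :] - points[None, :, :]).sum(axis=2)
    return points[d.sum(axis=1).argmin()]


# A tight group of four points plus one far outlier.
cluster = np.array(
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [10.0, 10.0]]
)
mean_center = kmeans_center(cluster)
medoid_center = pam_medoid(cluster)
```

In this toy cluster the mean is pulled toward the outlier, while the medoid remains one of the points in the tight group, which is one reason medoid-based methods are often considered less sensitive to outliers.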

Methods

Results

Conclusion


Journal: Scientific Reports	Publication Date: Jul 18, 2018
Citations: 5	License type: open-access


