Multi-cancer samples clustering via graph regularized low-rank representation method under sparse and symmetric constraints

Juan Wang, Jin-Xing Liu, Xiang-Zhen Kong, Ling-Yun Dai, Cong-Hai Lu

doi:10.1186/s12859-019-3231-5

Abstract

Background: Identifying different types of cancer based on gene expression data has become a hotspot in bioinformatics research. Clustering gene expression data from multiple cancers into their respective classes is an important way to do this. However, the high dimensionality and small sample size of gene expression data, together with noise in the data, make data mining and research difficult. Although many effective and feasible methods address this problem, these methods may still have shortcomings.

Results: In this paper, we propose the graph regularized low-rank representation under symmetric and sparse constraints (sgLRR) method, in which we introduce graph regularization based on manifold learning and symmetric sparse constraints into the traditional low-rank representation (LRR). In the sgLRR method, the symmetric constraint and the sparse constraint alleviate the effect of raw data noise on the low-rank representation. Further, by introducing graph regularization, the sgLRR method preserves the important intrinsic local geometrical structures of the raw data. We apply this method to cluster multi-cancer samples based on gene expression data, which improves the clustering quality. First, the gene expression data are decomposed by the sgLRR method to obtain a lowest-rank representation matrix that is symmetric and sparse. Then, an affinity matrix is constructed from this matrix. Finally, the multi-cancer samples are clustered by applying a spectral clustering algorithm, normalized cuts (Ncuts), to the affinity matrix.

Conclusions: A series of comparative experiments demonstrates that the sgLRR method, which is based on low-rank representation, has a clear advantage in the clustering of multi-cancer samples.
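
The three steps named in the abstract (low-rank decomposition, affinity construction, spectral clustering) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' sgLRR solver: the decomposition step uses the well-known closed-form solution of noise-free LRR (the projector onto the row space of the data) as a stand-in, the clustering step uses a generic normalized spectral relaxation with a small k-means rather than a specific Ncuts implementation, and all function names are hypothetical.

```python
import numpy as np

def lrr_noise_free(X):
    """Stand-in for the decomposition step (NOT sgLRR).

    Minimiser of ||Z||_* subject to X = XZ, where columns of X are samples:
    Z = V V^T, with V the right singular vectors of the skinny SVD of X.
    """
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    rank = int(np.sum(s > 1e-10 * s[0]))  # drop numerically zero directions
    V = Vt[:rank].T
    return V @ V.T  # symmetric by construction

def affinity_from_representation(Z):
    # Symmetrise absolute coefficients so the result is an undirected graph.
    return (np.abs(Z) + np.abs(Z.T)) / 2.0

def spectral_embedding(W, k):
    # Top-k eigenvectors of D^{-1/2} W D^{-1/2}, then row normalisation.
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    S = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(S)  # eigenvalues in ascending order
    U = vecs[:, -k:]
    norms = np.linalg.norm(U, axis=1, keepdims=True)
    return U / np.maximum(norms, 1e-12)

def kmeans(U, k, n_iter=100):
    # Deterministic farthest-point initialisation followed by Lloyd iterations.
    centers = [U[0]]
    for _ in range(1, k):
        d2 = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[int(np.argmax(d2))])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            U[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def cluster_samples(X, k):
    """Decompose, build the affinity matrix, then cluster the columns of X."""
    Z = lrr_noise_free(X)
    W = affinity_from_representation(Z)
    return kmeans(spectral_embedding(W, k), k)
```

Calling `cluster_samples(X, k)` on a genes-by-samples matrix `X` returns one integer label per sample. On real, noisy expression data the stand-in decomposition would need to be replaced by a solver that handles noise, which is the role the abstract assigns to sgLRR.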

Highlights

Identifying different types of cancer based on gene expression data has become a hotspot in bioinformatics research.
Motivated by the above methods, and in order to obtain a better lowest-rank matrix that avoids the simple symmetric operation and preserves the intrinsic local geometrical structures within the raw high-dimensional dataset, we introduce symmetric sparse constraints and graph regularization based on manifold learning into the LRR method, and propose the graph regularized low-rank representation method under combined sparse and symmetric constraints, or sgLRR for short.
In this paper, we introduce graph regularization based on manifold learning and symmetric sparse constraints into the original LRR and propose a novel method called sgLRR.
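
To make the graph regularization idea in these highlights concrete, here is a minimal sketch of how a manifold-based penalty is commonly built: a k-nearest-neighbour graph over the samples, its Laplacian L = D - W, and the term tr(Z L Z^T), which equals half the sum of W_ij * ||z_i - z_j||^2 over all pairs of columns of Z. The binary edge weights, the default k = 5, and the function names are assumptions for illustration; this excerpt does not state which graph or weighting the authors use.

```python
import numpy as np

def knn_graph_laplacian(X, k=5):
    """Symmetric 0/1 k-NN graph over the columns (samples) of X.

    Returns the weight matrix W and the unnormalised Laplacian L = D - W.
    Requires k < number of samples.
    """
    n = X.shape[1]
    sq = np.sum(X ** 2, axis=0)
    dist2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)  # squared distances
    np.fill_diagonal(dist2, np.inf)                      # exclude self-edges
    nearest = np.argsort(dist2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    W = np.maximum(W, W.T)  # keep an edge if either endpoint selected it
    L = np.diag(W.sum(axis=1)) - W
    return W, L

def graph_regulariser(Z, L):
    # tr(Z L Z^T): small when neighbouring samples have similar columns in Z.
    return float(np.trace(Z @ L @ Z.T))
```

Adding such a term to a low-rank objective penalizes representations in which samples that are close in the raw data receive very different coefficient vectors, which is the sense in which local geometry is preserved.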

Summary

Introduction

Identifying different types of cancer based on gene expression data has become a hotspot in bioinformatics research. Researchers have proposed many well-performing methods for gene expression data mining, such as K-means clustering [6], nonnegative matrix factorization (NMF) [7] and principal component analysis (PCA) [8]. Because of the high-dimensional nature of gene expression data, the low-rank representation (LRR) method has become popular and promising since its prototype was proposed by Liu et al. [9]. The LRR method can preserve the subspace structure of the raw dataset in a lowest-rank representation matrix. Owing to the advantages of this matrix, the LRR clustering method has been widely adopted in many fields, such as facial recognition [11], genetic microarray data clustering [12], image clustering [13] and subspace segmentation [14]. The LRR method achieves good results in processing high-dimensional datasets.
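
For reference, the LRR model referred to above is usually written in the following form. The symbols are not defined in this excerpt and are assumptions here: X is the data matrix with one sample per column, Z is the representation matrix, E is the noise term, and lambda > 0 balances the two terms.

```latex
\min_{Z,\,E}\; \|Z\|_{*} + \lambda \|E\|_{2,1}
\quad \text{s.t.} \quad X = XZ + E
```

Here the nuclear norm \|Z\|_{*} (the sum of singular values) is a convex surrogate for the rank of Z, and the \ell_{2,1} norm of E (the sum of the Euclidean norms of its columns) models sample-specific corruption.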

Journal: BMC Bioinformatics	Publication Date: Dec 1, 2019
Citations: 9	License type: open-access
