Joint Nonnegative Matrix Factorization Based on Sparse and Graph Laplacian Regularization for Clustering and Co-Differential Expression Genes Analysis

Ling-Yun Dai, Juan Wang, Rong Zhu

doi:10.1155/2020/3917812

Abstract

The explosion of multiomics data poses new challenges to existing data mining methods. Joint analysis of multiomics data can make the best of the complementary information provided by different types of data and can therefore explore the biological mechanisms of diseases more accurately. In this article, two forms of a joint nonnegative matrix factorization method based on sparse and graph Laplacian regularization (SG-jNMF) are proposed. In the method, the graph regularization constraint preserves the local geometric structure of the data, and L2,1-norm regularization enhances sparsity among the rows and removes redundant features in the data. First, SG-jNMF1 projects multiomics data into a common subspace and applies the multiomics fusion characteristic matrix to mine the important information closely related to diseases. Second, SG-jNMF2 maps multiomics data of the same disease into a common sample space, in which the cluster structures are detected clearly. Experimental results show that SG-jNMF achieves a significant improvement in sample clustering compared with existing joint analysis frameworks. SG-jNMF also effectively integrates multiomics data to identify co-differentially expressed genes (Co-DEGs). SG-jNMF thus provides an efficient integrative analysis method for mining the biological information hidden in heterogeneous multiomics data.
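As a rough illustration of the two regularizers named in the abstract, the sketch below computes an L2,1-norm (the sum of the Euclidean norms of a matrix's rows) and a graph Laplacian penalty of the form tr(H L Hᵀ). It is not the authors' implementation: the function names, the binary k-nearest-neighbor graph, and the samples-as-columns layout are all assumptions made for this example, and the abstract does not state how SG-jNMF builds its graph or weights its terms.

```python
import numpy as np


def l21_norm(M):
    """L2,1-norm: sum of the Euclidean norms of the rows of M."""
    return float(np.sum(np.sqrt(np.sum(M ** 2, axis=1))))


def knn_graph_laplacian(X, k=2):
    """Laplacian L = D - A of a symmetric, binary k-NN graph.

    X holds one sample per column (features x samples). The binary
    edge weighting is an assumption chosen for simplicity.
    """
    n = X.shape[1]
    # Pairwise squared Euclidean distances between columns.
    sq = np.sum(X ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbor

    A = np.zeros((n, n))
    nearest = np.argsort(d2, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    A[rows, nearest.ravel()] = 1.0
    A = np.maximum(A, A.T)  # symmetrize the adjacency matrix

    D = np.diag(A.sum(axis=1))
    return D - A


def graph_penalty(H, L):
    """tr(H L H^T) for a coefficient matrix H of shape (rank x samples)."""
    return float(np.trace(H @ L @ H.T))
```

With a symmetric adjacency matrix, the penalty equals half the sum of squared differences between the coefficient columns of connected samples, so it is small when neighboring samples receive similar low-dimensional representations. The L2,1-norm shrinks entire rows toward zero, which is why it acts as a row-wise feature selector.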

Highlights

With the development of state-of-the-art sequencing technology, a large quantity of effective experimental data has been collected. These data may imply some unknown molecular mechanisms.
Bioinformatics is faced with the task of analyzing massive omics data. The Cancer Genome Atlas (TCGA, https://tcgadata.nci.nih.gov/tcga/) includes gene expression profile data (GE), DNA methylation data (DM), copy number variation data (CNV), protein expression data, and drug sensitivity data. These data are from approximately 15,000 clinical samples of more than 30 kinds of cancers [1].
The TCGA project includes large amounts of gene expression profile data, DNA methylation data, copy number variation data, protein expression data, drug sensitivity data, and so on.

Summary

Introduction

With the development of state-of-the-art sequencing technology, a large quantity of effective experimental data has been collected. These data may imply some unknown molecular mechanisms. These massive data enable researchers to study the mechanisms of cancer development, diagnosis, and treatment at different biological levels. Scientists have performed considerable research on cancer mechanisms based on the joint analysis of cancer multiomics data. Christina et al. integrated the gene expression data and copy number variations of breast cancer, identified possible pathogenic genes, and discovered new subtypes of breast cancer [2]. Wang and Wang used similarity network fusion to jointly analyze mRNA, DM, and microRNA (miRNA) data and further identify cancer subtypes [3]. Liu et al. integrated mRNA, somatic cell mutation, DNA methylation, and copy number variation data. Integration and analysis of these heterogeneous multiomics data provide an in-depth understanding of the pathogenesis of cancer and promote the development of precision medicine. Unsupervised integrative methods based on matrix decomposition have attracted considerable

Methods

Results

Conclusion


Journal: Complexity	Publication Date: Nov 16, 2020
Citations: 4	License type: CC BY 4.0


