Abstract

In the big data era, sequencing technology has produced a large amount of biological sequencing data. Different views of cancer genome data provide sufficient complementary information to explore genetic activity. Identifying differentially expressed genes from multiview cancer gene data is of great importance in cancer diagnosis and treatment. In this paper, we propose a novel method for identifying differentially expressed genes based on tensor robust principal component analysis (TRPCA), which extends the matrix-based method to multiway data. The procedure has four steps. First, multiview data containing cancer gene expression data from different sources are prepared. Second, the original tensor is decomposed into the sum of a low-rank tensor and a sparse tensor using TRPCA. Third, the differentially expressed genes are treated as sparse perturbation signals and identified from the sparse tensor. Fourth, the identified genes are evaluated using the Gene Ontology and GeneCards tools. The validity of the TRPCA method was tested on two sets of multiview data. The experimental results showed that our method outperforms the representative methods in both efficiency and accuracy.
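The second and third steps can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the common t-SVD formulation of TRPCA solved by ADMM, a tensor laid out as genes × samples × views, and a default weight of 1/sqrt(max(n1, n2) · n3). The names `t_svt`, `soft_threshold`, `trpca`, and `rank_genes`, and the gene score (sum of absolute sparse entries per gene), are hypothetical choices made for illustration.

```python
import numpy as np

def t_svt(Y, tau):
    """Threshold the singular values of every frontal slice in the Fourier domain."""
    Yf = np.fft.fft(Y, axis=2)
    Xf = np.zeros_like(Yf)
    for k in range(Y.shape[2]):
        U, s, Vh = np.linalg.svd(Yf[:, :, k], full_matrices=False)
        Xf[:, :, k] = (U * np.maximum(s - tau, 0.0)) @ Vh
    # For real input the result is real up to rounding error.
    return np.real(np.fft.ifft(Xf, axis=2))

def soft_threshold(Y, tau):
    """Elementwise shrinkage, the proximal operator of the l1 norm."""
    return np.sign(Y) * np.maximum(np.abs(Y) - tau, 0.0)

def trpca(X, lam=None, mu=1e-3, rho=1.1, mu_max=1e10, tol=1e-7, max_iter=500):
    """Split X into a low-rank part L and a sparse part S with X ~ L + S (ADMM sketch)."""
    X = np.asarray(X, dtype=float)
    n1, n2, n3 = X.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(n1, n2) * n3)  # assumed default weight
    L = np.zeros_like(X)
    S = np.zeros_like(X)
    Y = np.zeros_like(X)  # Lagrange multiplier
    for _ in range(max_iter):
        L_new = t_svt(X - S - Y / mu, 1.0 / mu)
        S_new = soft_threshold(X - L_new - Y / mu, lam / mu)
        resid = L_new + S_new - X
        change = max(np.abs(L_new - L).max(),
                     np.abs(S_new - S).max(),
                     np.abs(resid).max())
        L, S = L_new, S_new
        Y = Y + mu * resid
        mu = min(rho * mu, mu_max)
        if change < tol:
            break
    return L, S

def rank_genes(S, top_k):
    """Return indices of the top_k genes (axis 0) by total absolute sparse signal."""
    scores = np.abs(S).sum(axis=(1, 2))
    order = np.argsort(scores)[::-1][:top_k]
    return [int(i) for i in order]
```

With a real data set, `X` would hold the prepared multiview expression values, and `rank_genes(trpca(X)[1], top_k)` would give candidate genes to pass on to the evaluation step. The scoring rule and all parameter defaults here are assumptions and may differ from the choices made in the paper.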

Highlights

  • With the rapid development of sequencing technology, large amounts of gene expression data have been generated

  • We propose a novel method for identifying differentially expressed genes based on tensor robust principal component analysis (TRPCA), which extends the matrix method to the processing of multiway data

  • The P-value is the probability of observing, by chance, at least x of the n genes in a list annotated to a particular GO term, given the proportion of genes annotated to that GO term in the entire genome
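The P-value described in the last highlight can be sketched as an upper-tail hypergeometric probability. This is an assumption about the underlying test, since the text does not name the distribution, and some GO tools use a binomial approximation instead. The function name `go_enrichment_pvalue` and the symbols `K` (genes in the genome annotated to the term) and `N` (genes in the genome) are introduced here for illustration only.

```python
from math import comb

def go_enrichment_pvalue(x, n, K, N):
    """P(X >= x) when n genes are drawn without replacement from a genome of
    N genes, K of which are annotated to the GO term (hypergeometric tail)."""
    upper = min(n, K)
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible draws contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(x, upper + 1))
    return tail / total
```

For example, with a genome of 10 genes of which 3 carry the annotation, a list of 2 genes containing at least 1 annotated gene has probability 24/45, so `go_enrichment_pvalue(1, 2, 3, 10)` evaluates to about 0.533. A small P-value indicates that the term is annotated to more genes in the list than chance alone would explain.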


Summary

Introduction

With the rapid development of sequencing technology, large amounts of gene expression data have been generated. Cancer (malignant tumor) is a common type of disease in this era and poses a serious threat to human health. The analysis of expression data can help explore the origin of life and understand differences between individuals. Genes that are abnormally expressed are common determinants of cancer or tumor onset in vivo. Identifying differentially expressed genes can help people explore the association between different diseases. Techniques to screen differentially expressed genes from gene expression data have gained much attention [1]. These data consist of tens of thousands of genes and hundreds of samples.

