Abstract

Genetic information is becoming increasingly important in biological research. Gene analysis, and in particular the analysis of differentially expressed genes, is an effective means in biological research. Robust principal component analysis (RPCA) is an effective method for identifying differentially expressed genes, but tensor robust principal component analysis (TRPCA) performs better than RPCA on multi-dimensional data. Traditional TRPCA, however, is still limited in how well it recovers the low-rank and sparse components. To further improve the accuracy with which TRPCA recovers these components, we propose a novel TRPCA method that captures the high-order correlation information of multi-dimensional data. The method uses a new nuclear norm based on the t-product operator to approximate the rank function. The $L_{2,1}$-norm is used to improve the sparsity of the tensors and to reduce the negative effects of noise and outliers; it enhances the sparsity of the error component and thereby improves the accuracy of low-rank component recovery. The low-rank and sparse components are obtained by solving the convex problem defined by the new tensor nuclear norm. The method preserves the spatial structure well and makes full use of complementary information to improve the clustering results. The alternating direction method of multipliers (ADMM) is used to solve the resulting optimization problem. Experimental results on different cancer genomic datasets indicate that our method outperforms the other methods.
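
To make the role of the $L_{2,1}$-norm more concrete, the following is a minimal sketch of the norm and of its proximal operator (column-wise shrinkage), the kind of closed-form update that ADMM-style solvers commonly use for an $L_{2,1}$-regularized error term. It is not the authors' implementation: it assumes the usual matrix definition (sum of the Euclidean norms of the columns), whereas the abstract does not say how the norm is applied to tensors, and the names `l21_norm`, `prox_l21`, `tau`, and `E` are hypothetical.

```python
import math


def l21_norm(matrix):
    """Sum of the Euclidean norms of the columns of a matrix (list of rows)."""
    n_cols = len(matrix[0]) if matrix else 0
    return sum(
        math.sqrt(sum(row[j] ** 2 for row in matrix))
        for j in range(n_cols)
    )


def prox_l21(matrix, tau):
    """Proximal operator of tau * ||.||_{2,1}: shrink each column toward zero.

    A column whose Euclidean norm is at most tau is set to zero; any other
    column is rescaled by (1 - tau / norm). The input is left unmodified.
    """
    n_cols = len(matrix[0]) if matrix else 0
    out = [list(row) for row in matrix]
    for j in range(n_cols):
        norm = math.sqrt(sum(row[j] ** 2 for row in matrix))
        scale = max(0.0, 1.0 - tau / norm) if norm > 0 else 0.0
        for i in range(len(matrix)):
            out[i][j] = scale * matrix[i][j]
    return out


# Toy error matrix (assumed data): one strong column, one weak column, one empty.
E = [[3.0, 0.1, 0.0],
     [4.0, 0.0, 0.0]]

print(l21_norm(E))
shrunk = prox_l21(E, 1.0)
print(shrunk)
```

In this toy example the weak second column is zeroed out entirely while the strong first column is only scaled down, which illustrates why the $L_{2,1}$-norm encourages column-wise (structured) sparsity in the error component.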
