The identification of cancer subtypes is of great importance for understanding the heterogeneity of tumors and providing patients with more accurate diagnoses and treatments. However, effectively integrating multiple types of omics data to establish cancer subtypes remains a challenge. In this paper, we propose an unsupervised integration method, named weighted multi-view low rank representation (WMLRR), to identify cancer subtypes from multiple types of omics data. Given a group of patients described by multiple omics data matrices, we first learn a unified affinity matrix that encodes the similarities among patients by exploring the sparsity-consistent low-rank representations obtained from the joint decomposition of the omics data matrices. Unlike existing subtype identification methods, which treat every omics data matrix equally, we assign a weight to each omics data matrix and learn these weights automatically during the optimization process. Finally, we apply spectral clustering to the learned affinity matrix to identify cancer subtypes. Experimental results show that survival times differ significantly among the identified cancer subtypes, and that our survival predictions are more accurate than those of other state-of-the-art methods. In addition, clinical analyses of the diseases further demonstrate the effectiveness of our method in identifying molecular subtypes with biological significance and clinical relevance.
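The last two stages of the pipeline described above (a weighted combination of per-view patient similarities, followed by spectral clustering) can be illustrated with a minimal sketch. It does not reproduce the WMLRR optimization: the Gaussian-kernel affinities stand in for the learned low-rank representations, the view weights are fixed by hand rather than learned, and the function names (`gaussian_affinity`, `fuse_views`, `spectral_clustering`) and the simulated data are hypothetical and used only for illustration.

```python
import numpy as np

def gaussian_affinity(X):
    """Patient-by-patient Gaussian similarity for one omics matrix (patients x features).
    Assumption: a stand-in for the low-rank representation learned by WMLRR."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    sigma2 = np.median(d2[d2 > 0])  # median heuristic for the bandwidth (assumption)
    return np.exp(-d2 / (2.0 * sigma2))

def fuse_views(affinities, weights):
    """Weighted average of per-view affinity matrices.
    Assumption: weights are given here, whereas WMLRR learns them during optimization."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * A for wi, A in zip(w, affinities))

def spectral_clustering(W, k, n_iter=50):
    """Normalized spectral clustering on a precomputed affinity matrix W."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)       # eigenvalues in ascending order
    U = vecs[:, :k]                       # eigenvectors of the k smallest eigenvalues
    U = U / np.linalg.norm(U, axis=1, keepdims=True)
    # Plain k-means on the embedding, with deterministic farthest-point initialization.
    centers = [U[0]]
    for _ in range(1, k):
        dist = np.min([((U - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = np.argmin(((U[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)
    return labels

# Simulated example: 20 patients, two omics views, two well-separated subtypes.
rng = np.random.default_rng(0)
shift = np.repeat([0.0, 4.0], 10)[:, None]  # 10 patients per simulated subtype
views = [rng.normal(size=(20, 30)) + shift, rng.normal(size=(20, 50)) + shift]
W = fuse_views([gaussian_affinity(X) for X in views], weights=[0.6, 0.4])
labels = spectral_clustering(W, k=2)
```

In this toy setting the fused matrix `W` plays the role of the unified affinity matrix, and `labels` gives one subtype assignment per patient; the weights `[0.6, 0.4]` are arbitrary placeholders.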