Abstract
A major challenge in clinical cancer research is the accurate identification of molecular subtypes. Although unsupervised clustering methods have been applied to class discovery, clustering remains a bottleneck in developing accurate methods for molecular subtype discovery. In this analysis, we hypothesize that spectral clustering can identify molecular subtypes that correlate with survival outcomes. We propose a subtype identification method based on spectral clustering, Cancer Subtype Identification with Spectral Clustering using Nyström approximation (CSISCN), for the discovery of molecular subtypes. CSISCN can be used to improve gene expression-based identification of breast cancer molecular subtypes. We demonstrate that CSISCN identified molecular subtypes with distinct clinical outcomes and was valid with respect to the number of molecular subtypes. Furthermore, CSISCN identified molecular subtypes with improved clinical and molecular relevance, significantly outperforming consensus clustering and spectral clustering methods. To test the general applicability of CSISCN, we further applied it to human colorectal cancer (CRC) and acute myeloid leukemia (AML) datasets, where it demonstrated superior performance compared with the consensus clustering method. In summary, CSISCN shows great potential for gene expression-based subtype identification.
Highlights
Identifying the subtype of a cancer is one of the leading areas of study in clinical cancer research
The k clusters were treated as molecular subtypes to stratify the validation cohort, and prediction performance was evaluated with Kaplan-Meier survival curves and the log-rank test
CSISCN was applied to different types of cancer to identify molecular subtypes and demonstrated superior performance compared with consensus clustering and spectral clustering methods
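The survival evaluation named in the highlights can be illustrated with a minimal sketch. It is not the authors' code: the function names `kaplan_meier` and `logrank_test` are hypothetical, the log-rank test is restricted to two groups, and the numbers are toy values rather than data from the study.

```python
import math

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (event=1, censored=0)."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    times_a, times_b = list(times_a), list(times_b)
    events_a, events_b = list(events_a), list(events_b)
    pooled = zip(times_a + times_b, events_a + events_b)
    event_times = sorted({t for t, e in pooled if e})
    obs_minus_exp = 0.0
    var = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            var += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / var if var > 0 else 0.0
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Toy follow-up times for two hypothetical subtypes (all events observed).
chi2, p_value = logrank_test([1, 2, 3, 4, 5], [1] * 5,
                             [10, 11, 12, 13, 14], [1] * 5)
```

A small p-value here indicates that the two groups have different survival curves, which is the sense in which clusters are said to have distinct clinical outcomes.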
Summary
Identifying the subtype of a cancer is one of the leading areas of study in clinical cancer research. Spectral clustering is often limited in its application to large-scale problems by its high computational complexity [27]. To address this challenge, spectral clustering using the Nyström approximation has been presented to reduce the computational cost of the matrix decomposition and improve clustering accuracy [28, 29]. We aimed to develop and evaluate a spectral clustering method using the Nyström approximation for identifying molecular subtypes of cancer. We investigated whether this method could identify molecular subtypes with improved clinical and molecular relevance. We proposed a subtype identification method based on spectral clustering, Cancer Subtype Identification with Spectral Clustering using Nyström approximation (CSISCN), for the discovery of molecular subtypes. To test the general applicability of CSISCN, we further applied it to human colorectal cancer (CRC) and acute myeloid leukemia (AML) datasets, where it demonstrated superior performance compared with consensus clustering and spectral clustering methods.
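To make the role of the Nyström approximation concrete, the following is a minimal sketch of one common variant of Nyström-based spectral clustering, written with NumPy. It is not the CSISCN implementation: the RBF affinity, uniform landmark sampling, the parameters `m` and `gamma`, and all function names are assumptions made for illustration, and the data are synthetic points rather than gene expression profiles. The idea is to compute affinities only between all n samples and m landmark samples, so that the decomposition costs roughly O(n m^2) instead of the O(n^3) of a full eigendecomposition.

```python
import numpy as np

def rbf_affinity(X, Y, gamma):
    """Gaussian (RBF) affinities between the rows of X and the rows of Y."""
    sq = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_embedding(X, k, m, gamma, rng):
    """Approximate the top-k spectral embedding using m landmark samples."""
    n = X.shape[0]
    landmarks = rng.choice(n, size=m, replace=False)
    C = rbf_affinity(X, X[landmarks], gamma)      # n x m affinities
    W = C[landmarks]                              # m x m landmark block
    evals, evecs = np.linalg.eigh((W + W.T) / 2.0)
    keep = evals > 1e-8 * evals.max()             # drop near-null directions
    # G @ G.T is the Nystrom approximation C W^+ C^T of the full affinity.
    G = C @ (evecs[:, keep] / np.sqrt(evals[keep]))
    degree = np.maximum(G @ G.sum(axis=0), 1e-12) # approximate row sums
    F = G / np.sqrt(degree)[:, None]              # normalized factor
    # Left singular vectors of F are eigenvectors of F @ F.T.
    U, _, _ = np.linalg.svd(F, full_matrices=False)
    emb = U[:, :k]
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    return emb / np.maximum(norms, 1e-12)

def kmeans(Z, k, n_iter=100):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [Z[0]]
    for _ in range(1, k):
        d = np.min([((Z - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(2)
        labels = dists.argmin(1)
        new = np.array([Z[labels == j].mean(0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

# Toy data: two well-separated synthetic groups of 30 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (30, 2)),
               rng.normal(10.0, 0.5, (30, 2))])
labels = kmeans(nystrom_embedding(X, k=2, m=20, gamma=0.5, rng=rng), k=2)
```

In this sketch the number of landmarks `m` trades accuracy for cost: a larger `m` gives a closer approximation of the full affinity matrix at a higher decomposition cost.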