Abstract
One of the most important research areas in personalized medicine is the discovery of disease sub-types with relevance in clinical applications. This is usually accomplished by exploring gene expression data with unsupervised clustering methodologies. Then, with the advent of multiple omics technologies, data integration methodologies have been further developed to obtain better performances in patient separability. However, these methods do not guarantee the survival separability of the patients in different clusters. We propose a new methodology that first computes a robust and sparse correlation matrix of the genes, then decomposes it and projects the patient data onto the first m spectral components of the correlation matrix. After that, a robust and adaptive to noise clustering algorithm is applied. The clustering is set up to optimize the separation between survival curves estimated cluster-wise. The method is able to identify clusters that have different omics signatures and also statistically significant differences in survival time. The proposed methodology is tested on five cancer datasets downloaded from The Cancer Genome Atlas repository. The proposed method is compared with the Similarity Network Fusion (SNF) approach, and model based clustering based on Student's t-distribution (TMIX). Our method obtains a better performance in terms of survival separability, even if it uses a single gene expression view compared to the multi-view approach of the SNF method. Finally, a pathway based analysis is accomplished to highlight the biological processes that differentiate the obtained patient groups. Our R source code is available online at https://github.com/angy89/RobustClusteringPatientSubtyping. Supplementary data are available at Bioinformatics online.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.