Abstract

Background: Comprehensive molecular profiling has revealed somatic variations in cancer at genomic, epigenomic, transcriptomic, and proteomic levels. The accumulating data has shown clearly that molecular phenotypes of cancer are complex and influenced by a multitude of factors. Conventional unsupervised clustering applied to a large patient population is inevitably driven by the dominant variation from major factors such as cell-of-origin or histology. Translation of these data into clinical relevance requires more effective extraction of information directly associated with patient outcome.

Methods: Drawing from ideas in supervised text classification, we developed survClust, an outcome-weighted clustering algorithm for integrative molecular stratification focusing on patient survival. survClust was performed on 18 cancer types across multiple data modalities including somatic mutation, DNA copy number, DNA methylation, and mRNA, miRNA, and protein expression from the Cancer Genome Atlas study to identify novel prognostic subtypes.

Results: Our analysis identified the prognostic role of high tumor mutation burden with concurrently high CD8 T cell immune marker expression and the aggressive clinical behavior associated with CDKN2A deletion across cancer types. Visualization of somatic alterations, at a genome-wide scale (total mutation burden, mutational signature, fraction genome altered) and at the individual gene level, using circomap further revealed indolent versus aggressive subgroups in a pan-cancer setting.

Conclusions: Our analysis has revealed prognostic molecular subtypes not previously identified by unsupervised clustering. The algorithm and tools we developed have direct utility toward patient stratification based on tumor genomics to inform clinical decision-making. The survClust software tool is available at https://github.com/arorarshi/survClust.

Highlights

  • Comprehensive molecular profiling has revealed somatic variations in cancer at genomic, epigenomic, transcriptomic, and proteomic levels

  • To overcome the current limitation of molecular clustering analysis, we developed the survClust algorithm as a supervised learning approach that aims to identify cancer subtypes that are not just molecularly distinct but also prognostically significant

  • By differentially weighting the molecular features by the corresponding survival association in constructing the distance matrix, we show that survClust is more powerful for identifying subtypes that are directly relevant to stratifying the outcome of interest, leading to substantially more distinct survival subgroups than the existing molecular subclasses obtained by unsupervised clustering
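
The idea of weighting features by survival association when building the distance matrix can be illustrated with a minimal sketch. This is not the survClust implementation: the function name `weighted_distance`, the toy data, and the weights are all hypothetical, and the weights are simply assumed to have been computed elsewhere (for example, from a per-feature survival model). The sketch only shows how non-negative feature weights change pairwise patient distances.

```python
import numpy as np


def weighted_distance(X, weights):
    """Pairwise weighted Euclidean distance between rows (patients) of X.

    `weights` holds one non-negative value per feature (column). Here it is
    a hypothetical stand-in for a survival-association score computed
    elsewhere; it is NOT derived from survival data in this sketch.
    """
    X = np.asarray(X, dtype=float)
    w = np.asarray(weights, dtype=float)
    if w.shape != (X.shape[1],):
        raise ValueError("need exactly one weight per feature")
    if np.any(w < 0):
        raise ValueError("weights must be non-negative")
    # Scaling each column by sqrt(w) reduces the weighted distance
    # sqrt(sum_k w_k * (x_ik - x_jk)^2) to an ordinary Euclidean distance.
    Z = X * np.sqrt(w)
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Toy data (made up): 4 patients x 2 features.
# Feature 1 has the larger spread and would dominate an unweighted distance.
X = np.array([
    [0.0, 0.0],
    [0.0, 5.0],
    [1.0, 0.0],
    [1.0, 5.0],
])

# Assumed weights: feature 0 treated as survival-associated, feature 1 as not.
weights = np.array([1.0, 0.0])

D = weighted_distance(X, weights)
D_unweighted = weighted_distance(X, np.ones(2))
```

With these assumed weights, patients 0 and 1 are at distance 0 in `D` because they differ only in the down-weighted feature, whereas in `D_unweighted` that same pair is 5 apart. A clustering step applied to `D` would therefore group patients by the feature assumed to carry prognostic signal rather than by the feature with the largest raw variation.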

Introduction

Comprehensive molecular profiling has revealed somatic variations in cancer at genomic, epigenomic, transcriptomic, and proteomic levels. Conventional unsupervised clustering applied to a large patient population is inevitably driven by the dominant variation from major factors such as cell-of-origin or histology. Translation of these data into clinical relevance requires more effective extraction of information directly associated with patient outcome. Linking comprehensive molecular profiling data with patient outcome carries great promise in addressing such important clinical questions. This requires innovative statistical and computational methods designed for integrative analysis of multidimensional data sets to model intra-tumor and inter-patient heterogeneity at genomic, epigenetic, and transcriptomic levels. Each of these molecular dimensions is correlated yet characterizes the disease in its own unique way. In order to arrive at a comprehensive molecular portrait of the tumor, multiple groups have proposed statistical and computational algorithms to synthesize various channels of information including methods developed by us (iCluster [1, 2]) and others (PARADIGM [3], CoCA [4], SNF [5], CIMLR [6]) to stratify
