Abstract
A major goal in translational cancer research is to identify biological signatures driving cancer progression and metastasis. A common technique applied in genomics research is to cluster patients using gene expression data from a candidate prognostic gene set, and if the resulting clusters show statistically significant outcome stratification, to associate the gene set with prognosis, suggesting its biological and clinical importance. Recent work has questioned the validity of this approach by showing in several breast cancer data sets that “random” gene sets tend to cluster patients into prognostically variable subgroups. This work suggests that new rigorous statistical methods are needed to identify biologically informative prognostic gene sets. To address this problem, we developed Significance Analysis of Prognostic Signatures (SAPS) which integrates standard prognostic tests with a new prognostic significance test based on stratifying patients into prognostic subtypes with random gene sets. SAPS ensures that a significant gene set is not only able to stratify patients into prognostically variable groups, but is also enriched for genes showing strong univariate associations with patient prognosis, and performs significantly better than random gene sets. We use SAPS to perform a large meta-analysis (the largest completed to date) of prognostic pathways in breast and ovarian cancer and their molecular subtypes. Our analyses show that only a small subset of the gene sets found statistically significant using standard measures achieve significance by SAPS. We identify new prognostic signatures in breast and ovarian cancer and their corresponding molecular subtypes, and we show that prognostic signatures in ER negative breast cancer are more similar to prognostic signatures in ovarian cancer than to prognostic signatures in ER positive breast cancer. SAPS is a powerful new method for deriving robust prognostic biological signatures from clinically annotated genomic datasets.
Highlights
The identification of pathways that predict prognosis in cancer is important for enhancing our understanding of the biology of cancer progression and for identifying new therapeutic targets
We describe such a statistical and computational framework (Significance Analysis of Prognostic Signature (SAPS)) to allow robust and biologically informative prognostic gene sets to be identified in disease
The Ppure assesses the statistical significance of survival differences observed between two groups of patients stratified using a candidate gene set, and this test provides insight into the potential clinical utility of a gene set for stratifying patients into prognostically variable groups; this statistical test provides no information to compare the prognostic performance of the candidate gene set with randomly generated (‘‘biologically null’’) gene sets
Summary
The identification of pathways that predict prognosis in cancer is important for enhancing our understanding of the biology of cancer progression and for identifying new therapeutic targets. Recent work by Venet et al has questioned the validity of this assumption by showing that most random gene sets are able to separate breast cancer cases into groups exhibiting significant survival differences [15]. This suggests that it is not valid to infer the biologic significance of a gene set in breast cancer based on its association with breast cancer prognosis and further, that new rigorous statistical methods are needed to identify biologically informative prognostic pathways. The gene sets identified by SAPS provide new insight into the mechanisms driving breast cancer development and progression
Published Version (Free)
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.