Abstract

A major challenge in cancer genomics is to identify genes with functional roles in cancer and uncover their mechanisms of action. We introduce an integrative framework that identifies cancer-relevant genes by pinpointing those whose interaction or other functional sites are enriched in somatic mutations across tumors. We derive analytical calculations that enable us to avoid time-prohibitive permutation-based significance tests, making it computationally feasible to simultaneously consider multiple measures of protein site functionality. Our accompanying software, PertInInt, combines knowledge about sites participating in interactions with DNA, RNA, peptides, ions, or small molecules with domain, evolutionary conservation, and gene-level mutation data. When applied to 10,037 tumor samples, PertInInt uncovers both known and newly predicted cancer genes, while additionally revealing what types of interactions or other functionalities are disrupted. PertInInt’s analysis demonstrates that somatic mutations are frequently enriched in interaction sites and domains and implicates interaction perturbation as a pervasive cancer-driving event.
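To make the contrast between an analytical significance calculation and a permutation test concrete, here is a minimal Python sketch. It is an illustration under simplifying assumptions, not PertInInt's actual formulation: each protein position carries a hypothetical functionality weight, each mutation is assumed to land independently and uniformly along the protein under the null, and the function names (`analytical_z`, `permutation_z`) and the toy data are invented for this example.

```python
import math
import random


def analytical_z(weights, mutated_positions):
    """Closed-form z-score of the summed weights hit by mutations.

    Assumed null model: each mutation lands independently and uniformly
    at random on one of the protein's positions.
    """
    n = len(mutated_positions)
    length = len(weights)
    mean_w = sum(weights) / length
    var_w = sum((w - mean_w) ** 2 for w in weights) / length
    observed = sum(weights[p] for p in mutated_positions)
    # A sum of n i.i.d. draws has mean n * mean_w and variance n * var_w.
    variance = n * var_w
    if variance == 0:
        return 0.0
    return (observed - n * mean_w) / math.sqrt(variance)


def permutation_z(weights, mutated_positions, n_perm=2000, seed=0):
    """Same z-score, estimated by re-placing the mutations n_perm times."""
    rng = random.Random(seed)
    n = len(mutated_positions)
    length = len(weights)
    observed = sum(weights[p] for p in mutated_positions)
    null_scores = [
        sum(weights[rng.randrange(length)] for _ in range(n))
        for _ in range(n_perm)
    ]
    mu = sum(null_scores) / n_perm
    sd = math.sqrt(sum((s - mu) ** 2 for s in null_scores) / n_perm)
    if sd == 0:
        return 0.0
    return (observed - mu) / sd


# Toy protein: 100 positions, the last 10 are (hypothetical) binding sites.
weights = [0.0] * 90 + [1.0] * 10
# 20 mutations, 8 of which fall in the binding sites.
mutated = list(range(12)) + list(range(90, 98))

z_exact = analytical_z(weights, mutated)
z_perm = permutation_z(weights, mutated)
print(f"analytical z = {z_exact:.2f}, permutation z = {z_perm:.2f}")
```

In this toy case the analytical score needs one pass over the weights, while the permutation estimate costs `n_perm` resamplings per gene and per weight track, which is the kind of overhead a closed-form calculation avoids when several measures of site functionality are scored at once.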

Highlights

  • Large-scale, concerted oncogenomic consortia, such as the Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC), have sequenced an unprecedented number of tumor genomes from thousands of patients across tens of cancer types (International Cancer Genome Consortium et al, 2010; TCGA Research Network et al, 2013)

  • We find that while each source of information—interaction, domain, evolutionary conservation, and whole-gene mutation frequency—is individually predictive of cancer genes, PertInInt uncovers more comprehensive sets of cancer-relevant genes when considering all sources of information together

  • Overview of the PertInInt framework: PertInInt aggregates somatic mutational data observed across tumor samples and identifies for each gene whether certain types of its functional sites are enriched in somatic mutations and/or whether the gene exhibits a high mutation rate across its length

Introduction

Large-scale, concerted oncogenomic consortia, such as the Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC), have sequenced an unprecedented number of tumor genomes from thousands of patients across tens of cancer types (International Cancer Genome Consortium et al, 2010; TCGA Research Network et al, 2013). Computational analyses of these datasets promise a revolution in precision oncology with additional insights into the genetic underpinnings of a staggeringly complex and heterogeneous disease (Chin and Gray, 2008). Existing subgene-level methods have derived such protein site functionality information from analyses of evolutionary conservation

