Abstract

Cancers are caused by the accumulation of genomic alterations. Driver mutations are required for the cancer phenotype, whereas passenger mutations are irrelevant to tumor development and accumulate through DNA replication. A major challenge in cancer genome sequencing is to identify the cancer-associated genes whose mutations drive the cancer phenotype. Here, we describe a powerful and flexible statistical framework for identifying driver genes and driver signaling pathways in cancer genome-sequencing studies. Biological knowledge of the mutational process in tumors is fully integrated into our statistical models, which account for the length of protein-coding regions, transcript isoforms, variation in mutation types, differences in background mutation rates, the redundancy of the genetic code, and multiple mutations in one gene. The framework provides several significant features that previous methods either do not address or handle only naively. In particular, motivated by the low prevalence of somatic mutations in individual tumors, we propose a heuristic strategy to estimate the mixture proportion of the chi-square mixture distribution of the likelihood ratio test (LRT) statistic. This provides significantly increased statistical power compared with the regular LRT. Through a combination of simulation and analysis of TCGA cancer sequencing study data, we demonstrate the high accuracy and sensitivity of our methods. Our statistical methods and several auxiliary bioinformatics tools have been incorporated into a computational tool, DrGaP. The tool is immediately applicable to cancer genome-sequencing studies and will lead to a more complete identification of altered driver genes and driver signaling pathways in cancer.
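The abstract does not state the exact form of the chi-square mixture or how its proportion is estimated. As a rough illustration only, the sketch below assumes the common boundary-case form, in which the LRT statistic under the null is a mixture of a point mass at zero and a chi-square distribution with one degree of freedom. The function names and the `point_mass_weight` parameter are hypothetical and are not part of DrGaP. The sketch shows why the mixture proportion matters: a larger weight on the point mass gives a smaller p-value for the same statistic, and hence more power than the regular LRT.

```python
import math


def chi2_1_sf(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    # For Z ~ N(0, 1): P(Z^2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))


def mixture_lrt_pvalue(lrt_stat: float, point_mass_weight: float = 0.5) -> float:
    """P-value of an LRT statistic under an assumed two-component null mixture.

    Assumption (not taken from the source): the null distribution is
    point_mass_weight * (point mass at 0)
    + (1 - point_mass_weight) * chi-square with 1 degree of freedom.
    A weight of 0 recovers the regular LRT with one degree of freedom.
    """
    if not 0.0 <= point_mass_weight <= 1.0:
        raise ValueError("point_mass_weight must lie in [0, 1]")
    if lrt_stat <= 0.0:
        # Every draw from the mixture is >= 0, so the tail probability is 1.
        return 1.0
    return (1.0 - point_mass_weight) * chi2_1_sf(lrt_stat)


if __name__ == "__main__":
    stat = 3.84  # illustrative value, not data from the study
    for weight in (0.0, 0.5, 0.9):
        print(f"weight={weight:.1f}  p={mixture_lrt_pvalue(stat, weight):.4f}")
```

In this sketch the weight is supplied by the caller. The heuristic the abstract describes would instead estimate it, and that estimation step is not reproduced here.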
