Abstract

High-throughput DNA sequencing is revolutionizing the study of cancer and enabling the measurement of the somatic mutations that drive cancer development. However, the resulting sequencing datasets are large and complex, obscuring the clinically important mutations in a background of errors, noise, and random mutations. Here, we review computational approaches to identify somatic mutations in cancer genome sequences and to distinguish the driver mutations that are responsible for cancer from random, passenger mutations. First, we describe approaches to detect somatic mutations from high-throughput DNA sequencing data, particularly for tumor samples that comprise heterogeneous populations of cells. Next, we review computational approaches that aim to predict driver mutations according to their frequency of occurrence in a cohort of samples, or according to their predicted functional impact on protein sequence or structure. Finally, we review techniques to identify recurrent combinations of somatic mutations, including approaches that examine mutations in known pathways or protein-interaction networks, as well as de novo approaches that identify combinations of mutations according to statistical patterns of mutual exclusivity. These techniques, coupled with advances in high-throughput DNA sequencing, are enabling precision medicine approaches to the diagnosis and treatment of cancer.

Highlights

  • High-throughput DNA sequencing is revolutionizing the study of cancer and enabling the measurement of the somatic mutations that drive cancer development

  • A second problem is distinguishing the relatively small number of driver mutations that are responsible for the development and progression of cancer from the large number of passenger mutations that are irrelevant to the cancer phenotype

  • Navin and colleagues [40,41] found similar heterogeneity in the copy number aberrations (CNAs) present within different regions of breast tumors. These results demonstrate that a single sample from a tumor might not fully represent the complete landscape of somatic mutations present in the tumor


Summary

Data Method

Designed to detect low-frequency mutations in both whole-genome and exome data. Can be applied to both whole-genome and whole-exome data.

ABSOLUTE [28] and ASCAT [29] are two algorithms that infer both tumor purity and tumor ploidy from single-nucleotide polymorphism (SNP) array data. Although both methods may be modified to work with DNA-sequencing data [33], they model a tumor sample as consisting of only two populations: normal cells and tumor cells.

Using high-coverage (188X) whole-genome DNA sequencing of a breast tumor, the investigators inferred the proportion of tumor cells containing somatic SNVs and CNAs and grouped these proportions into several clusters, demonstrating different mutational events during the evolutionary progression from the founder cell of the tumor to the present tumor cell population.

As a result of advances in DNA-sequencing technology, the measurement of somatic
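
The two-population view of a tumor sample mentioned above (normal cells plus tumor cells) can be illustrated with a short calculation. The sketch below is not the ABSOLUTE or ASCAT algorithm; it only shows, under simplifying assumptions, how tumor purity and local copy number change the fraction of reads expected to carry a somatic variant. The function name `expected_vaf` and its parameters are hypothetical, and the model assumes that normal cells are diploid, that every tumor cell carries the mutation, and that reads are sampled in proportion to DNA content.

```python
def expected_vaf(purity, tumor_copies, mutated_copies, normal_copies=2):
    """Expected variant allele fraction under a two-population mixture.

    Assumptions (illustrative only, not taken from the reviewed methods):
      - the sample contains only normal cells and one tumor population
      - purity is the fraction of cells in the sample that are tumor cells
      - tumor_copies is the total copy number at the locus in tumor cells
      - mutated_copies is how many of those copies carry the variant
      - normal cells carry normal_copies copies, none of them mutated
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must lie between 0 and 1")
    if not 0 <= mutated_copies <= tumor_copies:
        raise ValueError("mutated_copies must lie between 0 and tumor_copies")

    # DNA copies at the locus contributed by each population, per average cell.
    mutant_dna = purity * mutated_copies
    total_dna = purity * tumor_copies + (1.0 - purity) * normal_copies
    if total_dna == 0:
        raise ValueError("no DNA at this locus under the given parameters")
    return mutant_dna / total_dna


if __name__ == "__main__":
    # Heterozygous variant in a copy-neutral region at three purity levels.
    for purity in (1.0, 0.6, 0.3):
        print(purity, expected_vaf(purity, tumor_copies=2, mutated_copies=1))
```

In a copy-neutral region, a heterozygous variant is expected in half the reads only when the sample is pure tumor; the expected fraction falls as normal cells are mixed in, which is one reason low-purity samples make somatic mutations harder to detect.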
