Abstract

Somatic mutation calling from next-generation sequencing data remains a challenge because true somatic events are difficult to distinguish from artifacts arising from PCR, sequencing errors or mis-mapping. Tumor cellularity or purity, sub-clonality and copy number changes also confound the identification of true somatic events against a background of germline variants. We have developed a heuristic strategy and software (http://www.qcmg.org/bioinformatics/qsnp/) for somatic mutation calling in samples with low tumor content, and we show the superior sensitivity and precision of our approach using a previously sequenced cell line, a series of tumor/normal admixtures, and 3,253 putative somatic SNVs verified on an orthogonal platform.

Highlights

  • The declining cost of next-generation sequencing is enabling an increasing number of tumor sequencing studies [1,2,3], providing new insights into the mutations driving tumorigenesis

  • Despite this growing demand for accurate somatic mutation calls in cancer studies, mutation calling from next-generation sequencing data remains challenging

  • Many low purity tumor samples have been excluded from somatic mutation analysis to date due to the analytical challenges associated with accurately calling mutations in these samples and the expected high false negative rate



Introduction

The declining cost of next-generation sequencing is enabling an increasing number of tumor sequencing studies [1,2,3], providing new insights into the mutations driving tumorigenesis. These large-scale efforts are redefining the role of known oncogenes and tumor suppressor genes, identifying new candidate driver genes and providing insights into the mutational mechanisms at play in different tumor types [4,5]. Many low purity tumor samples have been excluded from somatic mutation analysis to date due to the analytical challenges associated with accurately calling mutations in these samples and the expected high false negative rate. Keeping the sensitivity of the analysis at the desired level in such samples carries the risk of calling an increasing number of false positives.

