Abstract

Background: High-throughput sequencing is rapidly becoming common practice in clinical diagnosis and cancer research. Many algorithms have been developed for somatic single nucleotide variant (SNV) detection in matched tumor-normal DNA sequencing. Although numerous studies have compared the performance of various algorithms on exome data, there has not yet been a systematic evaluation using PCR-enriched amplicon data with a range of variant allele fractions. The recently developed gold standard variant set for the reference individual NA12878 by the NIST-led “Genome in a Bottle” Consortium (NIST-GIAB) provides a good resource to evaluate admixtures with various SNV fractions.

Results: Using the NIST-GIAB gold standard, we compared the performance of five popular somatic SNV calling algorithms (GATK UnifiedGenotyper followed by simple subtraction, MuTect, Strelka, SomaticSniper and VarScan2) for matched tumor-normal amplicon and exome sequencing data.

Conclusions: We demonstrated that the five commonly used somatic SNV calling methods are applicable to both targeted amplicon and exome sequencing data. However, the sensitivities of these methods vary based on the allelic fraction of the mutation in the tumor sample. Our analysis can assist researchers in choosing a somatic SNV calling method suitable for their specific needs.

Highlights

  • High-throughput sequencing is rapidly becoming common practice in clinical diagnosis and cancer research

  • Comparison of somatic point mutation calling methods in the benchmark amplicon sequencing data: our goal was to evaluate the performance of somatic single nucleotide variant (SNV) detection methods from matched tumor-normal samples using amplicon sequencing

  • The resulting BAM files were used as input to five somatic mutation calling methods, including (1) MuTect, (2) Genome Analysis Toolkit (GATK) UnifiedGenotyper [22] followed by simple subtraction, (3) SomaticSniper, (4) Strelka, and (5) VarScan2 (Methods and Table 1)


Summary

Introduction

High-throughput sequencing is rapidly becoming common practice in clinical diagnosis and cancer research. Many algorithms have been developed for somatic single nucleotide variant (SNV) detection in matched tumor-normal DNA sequencing. Several methods have been developed to enhance somatic mutation calling accuracy [7,8,9,10,11]. These methods fall into two families. The first analyzes the tumor and normal datasets from an individual independently, then classifies the SNV type using a statistical significance test or simple subtraction. Agreement among different algorithms is relatively low [13,14], making selection of candidate SNVs for further validation difficult. This disagreement is likely due in part to the different error models and prior assumptions underlying each algorithm.
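As a rough illustration of the simple subtraction idea mentioned above, the sketch below treats the tumor and normal call sets as independent lists of variants and keeps only those found in the tumor but not in the matched normal. It also computes sensitivity against a known truth set, which is how a gold standard can be used to score a caller. The `Variant` tuple, the function names, and the toy coordinates are assumptions made for this example; they are not taken from the paper or from any of the tools it compares.

```python
from typing import Iterable, NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical minimal SNV record: position plus reference and alternate base."""
    chrom: str
    pos: int
    ref: str
    alt: str


def simple_subtraction(tumor_calls: Iterable[Variant],
                       normal_calls: Iterable[Variant]) -> Set[Variant]:
    """Keep variants called in the tumor sample but absent from the matched normal."""
    return set(tumor_calls) - set(normal_calls)


def sensitivity(called: Set[Variant], truth: Set[Variant]) -> float:
    """Fraction of true somatic variants that were recovered by the caller."""
    if not truth:
        raise ValueError("truth set must not be empty")
    return len(called & truth) / len(truth)


# Toy data (made up for illustration only).
tumor = {
    Variant("chr1", 100, "A", "T"),   # also present in normal, so germline
    Variant("chr1", 200, "G", "C"),
    Variant("chr2", 50, "C", "T"),
}
normal = {Variant("chr1", 100, "A", "T")}
truth_somatic = {
    Variant("chr1", 200, "G", "C"),
    Variant("chr3", 10, "T", "G"),    # missed by the toy tumor call set
}

somatic = simple_subtraction(tumor, normal)
print(sorted(somatic))
print(sensitivity(somatic, truth_somatic))
```

In this form the subtraction ignores genotype quality and read depth in the normal sample, so a variant missed in the normal because of low coverage would be reported as somatic. Stratifying the truth set by variant allele fraction before calling `sensitivity` would give a per-fraction view of caller performance.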

