Abstract

Cancer genomic profiles created by analysis of targeted Next Generation Sequencing (NGS) panels are emerging as a powerful tool for making informed clinical decisions. Among the critical informatics challenges to address, how best to obtain accurate mutation calls and allele frequency estimates after accounting for PCR-mediated artifacts remains debated. Sample preparation for NGS involves amplification by PCR. While PCR is relatively error-free, mistakes early in DNA synthesis can be compounded, driving detection of spurious mutations and adversely affecting clinical reporting. Previous reports have addressed the utility of detecting and removing PCR duplicate reads in Mendelian applications but have rarely examined its use with targeted NGS panels. We performed deduplication with 3 widely used tools (Samtools, Samblaster, and Picard) to assess sensitivity for calling low-frequency alleles and any impact on false positive and false negative rates. Furthermore, we evaluated the effects of duplicate removal on targeted panels of varying sizes and the effects of sample matrices, using replicates of 7 verified reference samples with several digitally confirmed alleles at frequencies ranging from 1-5%. Because it is not practical to perform deduplication on PCR-enriched panels, we assessed 3 hybrid enrichment panels of varying size (387kb, 1.3Mb, and 54Mb). Deduplication by Picard resulted in a greater decrease in mean depth for the smaller panels (32-59%) than for the Exome (15%), showing that higher molecular diversity lowers duplication rates. Uniformity (percent of ROI with depth within 20% of the mean depth) improved 6-18% after deduplication for the smaller panels, but only 1% for the Exome. Independent of panel size, about 32% of the total reads were marked as duplicates, reducing the power to call low-frequency variants by 18%. Importantly, after added sequencing, 95-96% of onco-specific variants were detected post-deduplication, with a lower limit of detection of 3% compared to 2.5% pre-removal. For low-quality DNA samples, we found no benefit in added sequencing for any panel. We also generally observed higher sensitivity post-deduplication (0% to 10% for SNVs and -3% to 3% for indels). Molecular diversity also varies by sample type: intact DNA shows higher molecular diversity and lower duplication rates than degraded FFPE samples. We profiled mixed-quality FFPE samples (n = 85), good-quality fresh frozen samples (n = 3), and NA12878 (n = 1) on our smallest panel and noted duplication rates of 64±14%, 40±5%, and 30%, respectively. On average, deduplication cut the number of SNV calls by 17.4%, with the FFPE samples affected the most (2-88%). Based on our analysis, we recommend performing deduplication during analysis of targeted panels. While we observed the most benefit for smaller panels with low uniformity, improved variant sensitivity was seen regardless of panel size. During experimental design, we advise using a worksheet to guide deduplication decisions.

Citation Format: Jeran Stratford, Gunjan Hariani, Jeff Jasper, Chad Brown, Wendell Jones, Victor J. Weigman. Impact of duplicate removal on low frequency NGS somatic variant calling. [abstract]. In: Proceedings of the 107th Annual Meeting of the American Association for Cancer Research; 2016 Apr 16-20; New Orleans, LA. Philadelphia (PA): AACR; Cancer Res 2016;76(14 Suppl):Abstract nr 5276.
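The summary metrics reported in the abstract (duplication rate, relative decrease in mean depth, and uniformity as the percent of ROI with depth within 20% of the mean depth) can be illustrated with a minimal Python sketch. It assumes per-position depths over the ROI are already available as a list of numbers; the function names and inputs are hypothetical and are not taken from the authors' pipeline or from Samtools, Samblaster, or Picard.

```python
from statistics import mean


def duplication_rate(total_reads, duplicate_reads):
    """Fraction of reads marked as duplicates (hypothetical helper)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return duplicate_reads / total_reads


def uniformity(depths, tolerance=0.20):
    """Fraction of ROI positions whose depth lies within +/- tolerance
    of the mean depth. The 20% default mirrors the definition in the
    abstract; the per-position input format is an assumption."""
    if not depths:
        raise ValueError("depths must not be empty")
    mean_depth = mean(depths)
    lower = mean_depth * (1 - tolerance)
    upper = mean_depth * (1 + tolerance)
    within = sum(1 for d in depths if lower <= d <= upper)
    return within / len(depths)


def mean_depth_decrease(depths_before, depths_after):
    """Relative drop in mean depth after deduplication."""
    before = mean(depths_before)
    if before <= 0:
        raise ValueError("mean depth before deduplication must be positive")
    return (before - mean(depths_after)) / before
```

For example, `uniformity([100, 110, 90, 50, 150])` counts the three positions that fall between 80 and 120 (the mean depth is 100) and returns 0.6. Comparing the value computed before and after duplicate removal gives the kind of uniformity change described above.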
