Abstract

Accurate allele frequencies are important for measuring subclonal heterogeneity and clonal evolution. Deep-targeted sequencing data can contain PCR duplicates, inflating perceived read depth. Here we adapted the Illumina TruSeq Custom Amplicon kit to include single molecule tagging (SMT) and show that SMT-identified duplicates arise from PCR. We demonstrate that retention of PCR duplicate reads can imply clonal evolution when none exists, while their removal effectively controls the false positive rate. Additionally, PCR duplicates alter estimates of subclonal heterogeneity in tumor samples. Our method simplifies PCR duplicate identification and emphasizes their removal in studies of tumor heterogeneity and clonal evolution.

Electronic supplementary material: The online version of this article (doi:10.1186/s13059-014-0420-4) contains supplementary material, which is available to authorized users.
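To make the effect described in the abstract concrete, the following is a minimal sketch of how collapsing reads by molecular tag can change an allele frequency estimate. It is an illustration under stated assumptions, not the authors' pipeline: the read representation (amplicon, tag, allele), the tag strings, the toy read counts, and the majority-vote consensus rule are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict


def variant_allele_frequency(alleles, variant="ALT"):
    """Return the fraction of observations that carry the variant allele."""
    alleles = list(alleles)
    if not alleles:
        raise ValueError("no observations")
    return sum(a == variant for a in alleles) / len(alleles)


def collapse_by_tag(reads):
    """Collapse reads sharing (amplicon, tag) into one consensus allele each.

    Assumption: reads with the same amplicon and the same molecular tag
    derive from one original molecule, so they count as one observation.
    The consensus here is a simple majority vote (hypothetical rule).
    """
    families = defaultdict(list)
    for amplicon, tag, allele in reads:
        families[(amplicon, tag)].append(allele)
    return [Counter(alleles).most_common(1)[0][0] for alleles in families.values()]


# Toy data (invented for illustration): one variant molecule was amplified
# six times by PCR, while three reference molecules were each read once.
reads = (
    [("amp1", "AAGT", "ALT")] * 6
    + [("amp1", "CCTA", "REF")]
    + [("amp1", "GGAC", "REF")]
    + [("amp1", "TTGC", "REF")]
)

raw_vaf = variant_allele_frequency(allele for _, _, allele in reads)
dedup_vaf = variant_allele_frequency(collapse_by_tag(reads))

print(f"with duplicates:    {raw_vaf:.3f}")
print(f"duplicates removed: {dedup_vaf:.3f}")
```

In this toy case the nine raw reads represent only four original molecules, so retaining duplicates overstates both the depth and the variant allele frequency. Real data would also need to handle sequencing errors within the tag itself, which this sketch ignores.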

Highlights

  • Sequencing technologies have recently allowed for an unprecedented window into the process of cancer evolution [1]

  • Adapting Illumina TruSeq to use single molecule tagging: The Illumina TruSeq Custom Amplicon Kit is a multiplex system for targeted sequencing that allows approximately 1,500 amplicons to be sequenced at the same time

  • Our study demonstrates the importance of identifying and removing PCR duplicates in studies of clonal evolution and cancer heterogeneity and provides a simple modification to a commercially available kit that allows for effective identification of PCR duplicates in deep-targeted sequencing


Summary

Introduction

Sequencing technologies have recently allowed for an unprecedented window into the process of cancer evolution [1]. Deep-targeted sequencing [5] is frequently used in studying how these clones change over time. This technology has provided insights into subclonal phylogenetic structures in cancer [6] and mutational patterns that occur and are selected for during tumor progression [7,8,9] and in response to treatment [10]. Patient treatment can be informed by subclonal heterogeneity [11,12], and deep-targeted sequencing can be used to track recurrence and evolution by the sequencing of circulating tumor DNA [13]. Deep-targeted sequencing is well-suited to provide accurate frequency estimates because each read can provide independent information. Obtaining accurate estimates of allele frequency can be complicated

