Abstract

The accurate detection of ultralow allele frequency variants in DNA samples is of interest in both research and medical settings, particularly in liquid biopsies, where cancer mutational status is monitored from circulating DNA. Next-generation sequencing (NGS) technologies employing molecular barcoding have shown promise, but significant improvements in sensitivity and specificity are still needed to detect mutations in a majority of patients before the metastatic stage. To address this, we present analytical validation data for ERASE-Seq (Elimination of Recurrent Artifacts and Stochastic Errors), a method for accurate and sensitive detection of ultralow frequency DNA variants in NGS data. ERASE-Seq differs from previous methods by creating a robust statistical framework that uses technical replicates in conjunction with background error modeling, providing a 10- to 100-fold reduction in false positive rates compared to published molecular barcoding methods. ERASE-Seq was tested using spiked human DNA mixtures with clinically realistic DNA input quantities to detect SNVs and indels between 0.05% and 1% allele frequency, the range commonly found in liquid biopsy samples. Variants were detected with greater than 90% sensitivity and a false positive rate below 0.1 calls per 10,000 possible variants. The approach represents a significant performance improvement compared to molecular barcoding methods and does not require changing molecular reagents.

Highlights

  • Next-generation sequencing (NGS) has opened the door to personalized medicine by drastically reducing the time and cost required to assess an individual’s nucleic acid composition[1]

  • Replicate experiments against the ERASE-Seq background model allow for an accurate categorization of false positive variant calls observed in sequencing data as either recurrent artifacts or stochastic errors

  • Recurrent artifacts are those false positive calls present in the single-replicate data but eliminated by ERASE-Seq due to their presence in the background model. Stochastic errors are those false positives called in single-replicate ERASE-Seq but eliminated with increasing replicate number
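
The two false positive categories defined above can be illustrated with a small sketch. This is not the ERASE-Seq implementation: the source does not describe the statistical test, so the one-sided binomial test, the `alpha` and `naive_min_alt` thresholds, and the function names `binom_sf` and `categorize_false_positive` are all assumptions made for illustration. The sketch takes a position known to be negative, checks whether a naive caller would report it in the first replicate, then asks whether a background error rate explains it (recurrent artifact) or whether it only disappears once more replicates are required to agree (stochastic error).

```python
from math import exp, lgamma, log, log1p


def binom_sf(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p), with 0 < p < 1.

    Terms are evaluated in log space so large read depths do not overflow.
    """
    if k <= 0:
        return 1.0
    below_k = 0.0
    for i in range(k):
        log_term = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                    + i * log(p) + (n - i) * log1p(-p))
        below_k += exp(log_term)
    return max(0.0, 1.0 - below_k)


def categorize_false_positive(replicates, background_rate,
                              alpha=1e-3, naive_min_alt=3):
    """Categorize a call at a position known to carry no true variant.

    replicates      -- list of (alt_reads, depth), one per technical replicate
    background_rate -- hypothetical per-read error rate at this position,
                       e.g. estimated from control samples
    alpha, naive_min_alt -- illustrative thresholds, not values from the paper
    """
    first_alt, _ = replicates[0]
    if first_alt < naive_min_alt:
        return "not called"
    # A replicate passes if its alt count is unlikely under background noise.
    passes = [binom_sf(alt, depth, background_rate) < alpha
              for alt, depth in replicates]
    if not passes[0]:
        return "recurrent artifact"       # explained by the background model
    if not all(passes):
        return "stochastic error"         # not reproduced across replicates
    return "survives all replicates"


if __name__ == "__main__":
    # Hypothetical read counts chosen only to exercise each branch.
    noisy_site = [(6, 2000), (5, 2000), (7, 2000)]
    one_off_site = [(4, 2000), (0, 2000), (1, 2000)]
    print(categorize_false_positive(noisy_site, background_rate=0.002))
    print(categorize_false_positive(one_off_site, background_rate=0.0001))
```

With these made-up counts, the first site is labeled a recurrent artifact because six alternate reads are unremarkable at a position whose background rate already predicts about four, and the second is labeled a stochastic error because the signal appears in only one of three replicates. If replicate errors were independent, requiring agreement across k replicates would shrink a per-replicate false positive probability p to roughly p^k, which is the intuition behind adding replicates.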


Summary

Introduction

Next-generation sequencing (NGS) has opened the door to personalized medicine by drastically reducing the time and cost required to assess an individual’s nucleic acid composition[1]. This has allowed for the successful identification of germline mutations relevant to inherited genetic disorders[2], cancer predisposition[3], and drug sensitivity[4], among others.

The respective companies did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.

