Abstract

Accurate analysis of quantitative next generation sequencing (NGS) data is critical for low-frequency variant detection, identification of differentially expressed transcripts, and correct diagnosis and patient care in a clinical NGS setting. Two major factors affecting sequencing accuracy are 1) PCR duplication arising from amplification of library molecules; and 2) errors introduced during library preparation and during sequencing on the flow cell. Standard practice for identifying and removing PCR duplicates relies on finding reads that align to the same genomic coordinates. However, this approach cannot differentiate between PCR duplicates and reads originating from unique molecules with identical ends. The most effective method for error correction is Duplex Sequencing, which uses unique molecular identifiers (UMIs) to tag both strands of each individual DNA duplex and then builds consensus sequences. Unfortunately, this approach is labor-intensive and cost-prohibitive for complex genomes and large target panels. Therefore, a simple and reliable approach for duplicate removal and error correction would greatly facilitate the wider adoption of NGS technology for diagnostics and clinical applications. In this study, we incorporate UMIs into unique dual sample indexing (UDI) systems and assess the effect on the accuracy of quantitative sequencing assays. We first study the effectiveness of various computational methods to account for UMIs and remove base-calling errors introduced during sequencing. We then analyze the utility of UMIs for 1) unique and low-abundance transcript identification and accurate transcript quantification; and 2) error correction in low-frequency variant detection in genomic sequencing from both high-quality cell line DNA and low-quality FFPE DNA. In addition, we demonstrate that combining unique dual sample indexing with UMI molecular barcoding further improves data analysis accuracy, especially on patterned flow cells. Our approach involves a simple new UMI-containing UDI adaptor design that can also be applied to other sequencing methods and platforms.

Citation Format: Pingfang Liu, Keerthana Krishnan, Camille X. Devoe, Bradley W. Langhorst, Eileen Dimalanta, Theodore B. Davis. Incorporation of unique molecular identifiers (UMIs) into unique dual sample indexing (UDI) improves the accuracy of quantitative next generation sequencing [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2019; 2019 Mar 29-Apr 3; Atlanta, GA. Philadelphia (PA): AACR; Cancer Res 2019;79(13 Suppl):Abstract nr 3526.
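
The distinction the abstract draws between coordinate-based and UMI-aware duplicate removal can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Read` record, the `dedup` helper, and the example UMI sequences are all hypothetical, and the sketch assumes each read's UMI has already been extracted and that UMIs are compared by exact match. Practical tools typically also tolerate sequencing errors within the UMI itself.

```python
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical aligned read; the UMI is assumed to be pre-extracted."""
    name: str
    chrom: str
    start: int
    end: int
    strand: str
    umi: str


def dedup(reads, use_umi):
    """Keep the first read seen for each duplicate key.

    With use_umi=False the key is the alignment position only (the
    coordinate-based practice described above). With use_umi=True the
    UMI is appended to the key, so distinct molecules that share
    identical ends are retained.
    """
    seen = {}
    for r in reads:
        key = (r.chrom, r.start, r.end, r.strand)
        if use_umi:
            key += (r.umi,)
        seen.setdefault(key, r)  # first read per key wins
    return list(seen.values())


reads = [
    Read("r1", "chr1", 100, 250, "+", "ACGTACGT"),
    Read("r2", "chr1", 100, 250, "+", "ACGTACGT"),  # PCR duplicate of r1
    Read("r3", "chr1", 100, 250, "+", "TTGACCAG"),  # distinct molecule, same ends
]

coordinate_only = dedup(reads, use_umi=False)
umi_aware = dedup(reads, use_umi=True)
print(len(coordinate_only), len(umi_aware))  # prints "1 2"
```

In this toy input, coordinate-only deduplication collapses all three reads into one and loses the distinct molecule `r3`, whereas the UMI-aware key removes only the true PCR duplicate `r2`.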
