Abstract

Nucleic acid sequence analyses are fundamental to all aspects of biological research, spanning aging, mitochondrial DNA (mtDNA) and cancer, as well as microbial and viral evolution. Over the past several years, significant improvements in DNA sequencing, including consensus sequence analysis, have proven invaluable for high-throughput studies. However, all current DNA sequencing platforms have limited utility for studies of complex mixtures or of individual long molecules, the latter of which is crucial to understanding the evolution and consequences of single nucleotide variants and their combinations. Here we report a new technology termed LUCS (Long-molecule UMI-driven Consensus Sequencing), in which reads from third-generation sequencing are aggregated by unique molecular identifiers (UMIs) specific for each individual DNA molecule. This enables in-silico reconstruction of a highly accurate consensus read for each DNA molecule, independent of other molecules in the sample. Additionally, the use of two UMIs enables detection of artificial recombinants (chimeras). As proof of concept, we show that application of LUCS to the assessment of mitochondrial genomes in complex mixtures from single cells was associated with an error rate of 1 × 10⁻⁴ errors/nucleotide. Thus, LUCS represents a major step forward in DNA sequencing that offers high-throughput capacity and high-accuracy reads in studies of long DNA templates and nucleotide variants in heterogeneous samples.
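
The UMI-based aggregation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy reads, and the chimera rule (a UMI pair is suspect when one of its UMIs also occurs in a better-supported pair) are all assumptions made for illustration, and the reads are assumed to be already aligned to equal length.

```python
from collections import Counter, defaultdict


def group_by_umi_pair(reads):
    """Group (umi5, umi3, sequence) tuples by their (umi5, umi3) pair."""
    groups = defaultdict(list)
    for umi5, umi3, seq in reads:
        groups[(umi5, umi3)].append(seq)
    return dict(groups)


def flag_chimeric_pairs(groups):
    """Return UMI pairs that share a UMI with a strictly better-supported pair.

    Hypothetical rule: each UMI should tag one molecule, so a pair whose
    5' or 3' UMI also appears in a larger group is a candidate chimera.
    Ties are left unflagged in this sketch.
    """
    best5, best3 = {}, {}
    for (umi5, umi3), seqs in groups.items():
        best5[umi5] = max(best5.get(umi5, 0), len(seqs))
        best3[umi3] = max(best3.get(umi3, 0), len(seqs))
    return {
        (umi5, umi3)
        for (umi5, umi3), seqs in groups.items()
        if len(seqs) < best5[umi5] or len(seqs) < best3[umi3]
    }


def majority_consensus(seqs):
    """Per-column majority vote over equal-length, pre-aligned reads."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("reads must be non-empty and aligned to equal length")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


# Toy input: three reads of one molecule (one with a sequencing error),
# one read pairing UMI "AAA" with a different 3' UMI, two reads of another molecule.
example_reads = [
    ("AAA", "TTT", "ACGTAC"),
    ("AAA", "TTT", "ACGTAC"),
    ("AAA", "TTT", "ACCTAC"),
    ("AAA", "GGG", "ACGTTT"),
    ("CCC", "GGG", "TTGTTT"),
    ("CCC", "GGG", "TTGTTT"),
]

groups = group_by_umi_pair(example_reads)
suspects = flag_chimeric_pairs(groups)
consensus = {
    pair: majority_consensus(seqs)
    for pair, seqs in groups.items()
    if pair not in suspects
}
```

In a real workflow the reads within each UMI group would first need to be aligned, since third-generation reads contain insertions and deletions that a simple column-wise vote cannot handle.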

Highlights

  • Every area of biological and biomedical research is rooted one way or another in understanding the precise order of nucleotides in DNA and RNA molecules, and how changes in these sequences subsequently alter downstream function and phenotype within and across generations

  • Homozygous Polg mice (Polg^D257A/D257A) exhibit an elevated rate of accumulation of mitochondrial DNA (mtDNA) mutations, reaching ~13.6 mutations per mtDNA molecule during early adulthood, and serve as an excellent model for testing the sensitivity of single nucleotide variant (SNV) calling in our sequencing strategy

  • Because inherent error rates in third-generation sequencing fluctuate around 10–15%, support fractions that are significantly lower than 80% raise suspicion over the reliability of an identified variant
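
The support-fraction criterion in the last highlight can be sketched as follows. This is a hedged illustration, not the authors' implementation: the function names are hypothetical, the reads of a UMI group are assumed to be pre-aligned to equal length, and the 0.8 cutoff is taken from the 80% figure quoted above.

```python
from collections import Counter


def support_fractions(seqs):
    """For each aligned column, return (majority base, fraction of reads supporting it)."""
    if not seqs or len({len(s) for s in seqs}) != 1:
        raise ValueError("reads must be non-empty and aligned to equal length")
    result = []
    for col in zip(*seqs):
        base, count = Counter(col).most_common(1)[0]
        result.append((base, count / len(col)))
    return result


def low_support_positions(seqs, min_support=0.8):
    """Return 0-based positions whose majority base is supported by < min_support of reads."""
    return [
        i
        for i, (_, frac) in enumerate(support_fractions(seqs))
        if frac < min_support
    ]
```

For example, in a group of five reads where three carry G and two carry C at one position, the support fraction there is 0.6, so the position would be flagged for closer inspection rather than called outright.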

Introduction

Every area of biological and biomedical research is rooted one way or another in understanding the precise order of nucleotides in DNA and RNA molecules, and how changes in these sequences subsequently alter downstream function and phenotype within and across generations. With the subsequent discovery that the use of radioactive or fluorescent probes to infer nucleotide sequences could be replaced with a luciferase-based pyrophosphate synthesis method referred to as pyrosequencing [6], commercial next-generation sequencing (NGS) was born. Continued improvement in this technology, which relied on specially designed machines capable of performing tremendous numbers of sequencing reactions in parallel, enabled rapid development of high-throughput DNA sequencing that defined the era of second-generation sequencing. The ability to perform single-molecule sequencing (SMS), and minimize biases and errors inherent in DNA amplification, heralded the transition to third-generation sequencing [7, 8].

www.aging-us.com

