Abstract

Short tandem repeats (STRs) are polymorphic genomic loci valuable for applications such as research, diagnostics and forensics. However, their polymorphic nature also introduces noise ("stutter") during in vitro amplification, making them difficult to analyze. Although stutter noise can be avoided by using amplification-free library preparation, such protocols are presently incompatible with single cell analysis and with targeted-enrichment protocols. To address this challenge, we have designed a method for direct measurement of in vitro noise. Using a synthetic STR sequencing library, we have calibrated a Markov model that predicts stutter patterns at any amplification cycle. With this model, we accurately genotyped cases of severe amplification bias as well as biallelic STR signals, and validated the model for several high-fidelity PCR enzymes. Finally, we compared a naïve STR genotyping strategy built on this model against the state of the art on a benchmark of single cells, where it showed superior accuracy.
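
To make the idea of a Markov model of stutter concrete, the following is a minimal sketch in Python, not the calibrated model described above. It assumes that in each amplification cycle a molecule loses one repeat unit with probability p_down, gains one with probability p_up, and otherwise keeps its length. The function names, the slippage probabilities and the length-independence of those probabilities are all hypothetical simplifications chosen for illustration. The naive genotyper simply picks the single allele length whose predicted pattern best explains an observed read-length histogram.

```python
import numpy as np


def transition_matrix(max_len, p_down=0.02, p_up=0.005):
    """One-cycle transition matrix over repeat lengths 0..max_len.

    p_down and p_up are hypothetical per-cycle slippage probabilities,
    NOT values calibrated from a synthetic STR library.
    """
    n = max_len + 1
    T = np.zeros((n, n))
    for i in range(n):
        down = p_down if i > 0 else 0.0       # cannot contract below 0 repeats
        up = p_up if i < max_len else 0.0     # cap expansion at max_len
        if i > 0:
            T[i, i - 1] = down
        if i < max_len:
            T[i, i + 1] = up
        T[i, i] = 1.0 - down - up             # no slippage in this cycle
    return T


def predict_stutter(true_len, cycles, T):
    """Predicted distribution of repeat lengths after `cycles` cycles."""
    start = np.zeros(T.shape[0])
    start[true_len] = 1.0                     # all molecules start at the true length
    return start @ np.linalg.matrix_power(T, cycles)


def genotype(observed_counts, cycles, T):
    """Naive single-allele call by maximum multinomial log-likelihood."""
    obs = np.asarray(observed_counts, dtype=float)
    # Row i of Tn is the predicted stutter pattern for true length i.
    Tn = np.linalg.matrix_power(T, cycles)
    loglik = (obs * np.log(Tn + 1e-12)).sum(axis=1)
    return int(np.argmax(loglik))


if __name__ == "__main__":
    T = transition_matrix(max_len=30)
    pattern = predict_stutter(true_len=15, cycles=30, T=T)
    reads = np.round(pattern * 1000)          # idealized, noise-free read counts
    print("called repeat length:", genotype(reads, cycles=30, T=T))
```

Because the per-cycle matrix is raised to the number of cycles, the same sketch yields a predicted pattern for any amplification depth. A biallelic call would need a mixture of two rows of the powered matrix, which is left out here.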

Highlights

  • To study the stutter pattern as a function of amplification, we designed and ordered a library of plasmids (Figure 1A), each containing a unique combination of STR type and length, spanning all naturally occurring mono- and di-nucleotide repeats across the full range of their natural genomic occurrence [18] (Supplemental Table S1)

  • High-throughput sequencing opens a new frontier for STR science, both for basic [4,6] and for applied research [21,22]

Introduction

Short tandem repeats (STRs, also known as microsatellites) are repetitive elements with a unit length of 1–6 bp that constitute ∼3% of the human genome. They are best known for their high mutability in vivo, which is due to polymerase slippage that results in repeat contraction or expansion. Their mutation rates vary dramatically, but even low estimates are 3–4 orders of magnitude higher than those of random point mutations, which makes STRs a tool of growing interest for various applications [1]. With technological advances in single cell (SC) genomics, SC STR analysis has become of research interest for applications such as cell lineage phylogenetic analysis within an organism [7,8] and pre-implantation genetic diagnosis [9].
