Abstract

Thermostable group II intron reverse transcriptases (TGIRTs) with high fidelity and processivity have been used for a variety of RNA sequencing (RNA-seq) applications, including comprehensive profiling of whole-cell, exosomal, and human plasma RNAs; quantitative tRNA-seq based on the ability of TGIRT enzymes to give full-length reads of tRNAs and other structured small ncRNAs; high-throughput mapping of post-transcriptional modifications; and RNA structure mapping. Here, we improved TGIRT-seq methods for comprehensive transcriptome profiling by rationally designing RNA-seq adapters that minimize adapter dimer formation. Additionally, we developed biochemical and computational methods for remediating 5′- and 3′-end biases, the latter based on a random forest regression model that provides insight into the contribution of different factors to these biases. These improvements, some of which may be applicable to other RNA-seq methods, increase the efficiency of TGIRT-seq library construction and improve coverage of very small RNAs, such as miRNAs. Our findings provide insight into the biochemical basis of 5′- and 3′-end biases in RNA-seq and suggest general approaches for remediating biases and decreasing adapter dimer formation.

Highlights

  • Thermostable group II intron reverse transcriptase (RT) (TGIRTs) enzymes to give full-length reads of tRNAs and other structured small ncRNAs; high-throughput mapping of post-transcriptional modifications; and RNA structure mapping

  • Analysis of TGIRT-seq datasets obtained for fragmented Universal Human Reference RNA (UHRR) or plasma DNA suggested that a major source of sequence bias is the DNA ligation step using the thermostable 5′ App DNA/RNA ligase, which has a preference for A or C and against U/T at position −3 from the 3′ end of the acceptor nucleic acid[7,28,29]

  • We noticed that the R2R adapter used previously for TGIRT-seq has a C-residue at position −3 from its 3′ end, which favors the formation of R1R-R2R adapter dimers during the ligation step (Fig. 1B)

Read more

Summary

Introduction

TGIRT enzymes to give full-length reads of tRNAs and other structured small ncRNAs; high-throughput mapping of post-transcriptional modifications; and RNA structure mapping. We developed biochemical and computational e methods for remediating 5′- and 3′-end biases, the latter based on a random forest regression model that provides insight into the contribution of different factors to these biases These improvements, t some of which may be applicable to other RNA-seq methods, increase the efficiency of TGIRT-seq library construction and improve coverage of very small RNAs, such as miRNAs. Our findings provide insight c into the biochemical basis of 5′- and 3′-end biases in RNA-seq and suggest general approaches for remediating biases and decreasing adapter dimer formation. A weakness of most current RNA-seq methods is their use of a retroviral reverse transcriptase (RT) to copy target RNAs into cDNAs for sequencing on various high-throughput DNA sequencing platforms[4]. Unlike retroviral RTs, which evolved to help retroviruses evade host defenses by introducing and propnagating mutational variations[5], group II intron RTs evolved to function in retrohoming, a retrotranposition mechanism that requires faithful synthesis of a full-length cDNA of a long, highly structured group II intron

Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call