Abstract

Background: RNA-seq is a next-generation sequencing method with a wide range of applications including single nucleotide polymorphism (SNP) detection, splice junction identification, and gene expression level measurement. However, RNA-seq sequence data can be biased during library construction, resulting in incorrect data for SNP, splice junction, and gene expression studies. Here, we developed new library preparation methods to limit such biases.

Results: A whole transcriptome library prepared for the SOLiD system displayed numerous read duplications (pile-ups) and gaps in known exons. The pile-ups and gaps of the whole transcriptome library caused a loss of SNP and splice junction information and reduced the quality of gene expression results. Further, we found clear sequence biases for both 5' and 3' end reads in the whole transcriptome library. To remove this bias, RNaseIII fragmentation was replaced with heat fragmentation. For adaptor ligation, T4 Polynucleotide Kinase (T4PNK) was used following heat fragmentation. However, its kinase and phosphatase activities introduced additional sequence biases. To minimize them, we used OptiKinase before T4PNK. Our study further revealed the specific target sequences of RNaseIII and T4PNK.

Conclusions: Our results suggest that heat fragmentation removed the RNaseIII sequence bias and significantly reduced the pile-ups and gaps. OptiKinase minimized the T4PNK sequence biases and removed most of the remaining pile-ups and gaps, thus maximizing the quality of RNA-seq data.

Reviewers: This article was reviewed by Dr. A. Kolodziejczyk (nominated by Dr. Sarah Teichmann), Dr. Eugene Koonin, and Dr. Christoph Adami. For the full reviews, see the Reviewers' Comments section.

Highlights

  • RNA-seq is a next-generation sequencing method with a wide range of applications including single nucleotide polymorphism (SNP) detection, splice junction identification, and gene expression level measurement

  • RNA-seq library construction methods vary among sequencing instruments

  • For the gene specific library, complementary DNA was fragmented after gene specific reverse transcription and amplification, and adaptors were ligated to the fragmented cDNA (Figure 1)

Summary

Introduction

RNA-seq is a next-generation sequencing method with a wide range of applications including single nucleotide polymorphism (SNP) detection, splice junction identification, and gene expression level measurement. RNA-seq sequence data can be biased during library construction, resulting in incorrect data for SNP, splice junction, and gene expression studies. We developed new library preparation methods to limit such biases. Next-generation sequencing is revolutionizing biological data acquisition. It can be used instead of many existing specialized measurement approaches. It can measure expression levels in any [...]. One disadvantage of RNA-seq is that its sequence data can be biased by library construction. We developed three new RNA-seq library preparation methods and compared them with the current RNA-seq library construction method for the SOLiD system. We identified the step in library preparation that caused the most pronounced bias and outlined alternative preparation techniques that can virtually eliminate this concern.
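
To make the two symptoms discussed above more concrete, the following is a minimal sketch of how read-end sequence bias and pile-ups might be quantified. It is not the authors' analysis pipeline. It assumes reads are already available as plain base-space strings (SOLiD data are natively in color space and would need conversion first) and that mapped start positions are given as integers. The function names `end_base_frequencies` and `duplication_rate` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def end_base_frequencies(reads, k=5, end="5p"):
    """Per-position base frequencies over the first (5') or last (3') k bases.

    Assumption: reads are base-space strings over A/C/G/T.
    A strongly skewed position suggests a sequence preference at that read end.
    """
    counts = [Counter() for _ in range(k)]
    for read in reads:
        if len(read) < k:
            continue  # skip reads shorter than the window
        window = read[:k] if end == "5p" else read[-k:]
        for i, base in enumerate(window):
            counts[i][base] += 1

    freqs = []
    for c in counts:
        total = sum(c.values())
        # Empty dict when no read contributed to this position
        freqs.append({b: c[b] / total for b in "ACGT"} if total else {})
    return freqs


def duplication_rate(start_positions):
    """Fraction of reads whose start position was already seen.

    Assumption: a crude pile-up proxy; real duplicate marking also
    considers strand, chromosome, and read length.
    """
    if not start_positions:
        return 0.0
    return 1 - len(set(start_positions)) / len(start_positions)


if __name__ == "__main__":
    reads = ["ACGTA", "ACGTT", "AGGTA", "TCGTA"]  # toy data, not from the study
    print(end_base_frequencies(reads, k=2))
    print(duplication_rate([10, 10, 10, 25]))
```

With unbiased fragmentation, the per-position frequencies at read ends would be expected to resemble the overall base composition of the transcriptome, so large deviations at specific positions are what a check like this would flag.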
