Abstract

Background: RNA sequencing (RNA-seq) has become an indispensable tool to identify disease-associated transcriptional profiles and determine the molecular underpinnings of diseases. However, broad adoption of the methodology in the clinic is still hampered by inconsistent results from different RNA-seq protocols and requires further evaluation of its analytical reliability using patient samples. Here, we applied two commonly used RNA-seq library preparation protocols to samples from acute leukemia patients to understand how poly-A-tailed mRNA selection (PA) and ribo-depletion (RD) based library preparation affect gene fusion detection, variant calling, and gene expression profiling.

Results: Overall, the two protocols produced consistent results. Nevertheless, the PA protocol was more efficient in quantifying expression of leukemia marker genes and showed better performance in the expression-based classification of leukemia. Independent qRT-PCR experiments verified that the PA protocol better represented total RNA compared to the RD protocol. In contrast, the RD protocol detected a higher number of non-coding RNA features and had better alignment efficiency. The RD protocol also recovered more known fusion-gene events, although variability was seen in fusion gene predictions.

Conclusion: The overall findings provide a framework for the use of RNA-seq in a precision medicine setting with a limited number of samples and suggest that selection of the library preparation protocol should be based on the objectives of the analysis.

Highlights

  • RNA sequencing (RNA-seq) has become an indispensable tool to identify disease-associated transcriptional profiles and determine the molecular underpinnings of diseases.

  • Generation of a sequencing library for RNA-seq analysis is a complex, multi-step process and a potential source of significant variation [11, 12]. This process is most commonly carried out using poly-A-tailed mRNA selection (PA) or rRNA depletion (RD) to eliminate rRNAs that are naturally abundant in the sample and would otherwise dominate the sequence data [13, 14].

  • Our analyses showed that PA and RD protocols produced consistent results and that patient heterogeneity represented the largest source of variation.

Summary

Introduction

RNA sequencing (RNA-seq) has become an indispensable tool to identify disease-associated transcriptional profiles and determine the molecular underpinnings of diseases. The technique has been insightful in understanding the pathogenesis and classification of leukemia [5, 6]. It has enabled identification of a wide variety of clinically relevant predictive expression biomarkers [4, 7], fusion genes and recurrent mutations [8, 9], expressed variants [5], and alternative splicing events [10]. Generation of a sequencing library for RNA-seq analysis is a complex, multi-step process and a potential source of significant variation [11, 12].

Methods
Results
Discussion
Conclusion
