Abstract

In contrast to traditional methods such as real-time polymerase chain reaction, next-generation sequencing (NGS), and small RNA-seq in particular, enables the untargeted investigation of the whole small RNAome, including microRNAs (miRNAs) as well as a multitude of other RNA species. Given the promising application of small RNAs as biofluid-based biomarkers, small RNA-seq is the method of choice for an initial discovery study. However, the way specific quality aspects of small RNA-seq data are presented varies significantly between laboratories, and a common (minimal) standard is lacking. The miRNA NGS Discovery pipeline (miND) aims to bridge the gap between wet-lab scientists and bioinformaticians with an easy-to-set-up configuration sheet and an automatically generated, comprehensive report that contains all essential qualitative and quantitative results that should be reported. Besides standard steps such as preprocessing, mapping, visualization, and quantification of reads, the pipeline also performs differential expression analysis when given the appropriate information on sample groups. Although miND focuses on miRNAs, other RNA species such as tRNA, piRNA, snRNA, and snoRNA are included, and mapping statistics are available for further analysis. miND has been developed and tested on a multitude of data sets from various RNA sources (tissue, plasma, extracellular vesicles, urine, etc.) and different species. miND is a Snakemake-based pipeline and thus offers all the advantages of a flexible workflow management system. Reference databases are downloaded, prepared, and built with an included (but separate) workflow, so they can easily be updated to the most recent version and also stored for reproducibility. In conclusion, the miND pipeline aims to streamline the bioinformatics processing of small RNA-seq data by standardizing the processing from raw data to a final, comprehensive, and reproducible report.

Highlights

  • Small RNA-seq is a well-established tool for the quantification of short RNA molecules such as microRNAs (miRNAs) in various biofluids (Murillo et al, 2019)

  • This makes miRNAs interesting targets as biomarkers in liquid biopsy (Larrea et al, 2016)

  • To meet the need for a standardized report that contains all relevant data and an initial statistical analysis, we developed a small RNA-seq data processing pipeline that provides one centralized report. It bridges the gap between biologists and bioinformaticians by taking easy-to-prepare data submission files as input and producing a detailed, well-documented, and interactive report as output


Summary

Introduction

Small RNA-seq is a well-established tool for the quantification of short RNA molecules such as microRNAs (miRNAs) in various biofluids (Murillo et al, 2019). These short RNA molecules (17 to 25 nt) play an important role in the cellular regulation of gene expression by interacting with specific complementary sites in targeted messenger RNAs (mRNAs). For example, the liver-specific miR-122-5p was shown to be a suitable marker for liver injury when measured in serum or plasma (Llewellyn et al, 2021) and, as part of a miRNA expression signature, can even be used to predict recovery after liver resection (Starlinger et al, 2019). This makes miRNAs interesting targets as biomarkers in liquid biopsy (Larrea et al, 2016). Wrapper scripts for starting the pipeline on Linux-based systems are provided and can be adapted for use on other platforms.
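As an illustration of what a platform-independent wrapper around a Snakemake-based pipeline might look like, the following is a minimal sketch and not the wrapper shipped with miND. The function name `build_snakemake_command`, the file names `Snakefile` and `config.yaml`, and the default core count are hypothetical; only the Snakemake command-line options `--snakefile`, `--configfile`, `--cores`, and `--dry-run` are taken from the standard Snakemake interface.

```python
import shlex


def build_snakemake_command(snakefile, configfile, cores=4, dry_run=False):
    """Assemble a Snakemake command line as a list of arguments.

    All argument values are placeholders chosen for illustration;
    they do not reflect the actual layout of the miND repository.
    """
    if cores < 1:
        raise ValueError("cores must be a positive integer")
    cmd = [
        "snakemake",
        "--snakefile", str(snakefile),    # workflow definition file
        "--configfile", str(configfile),  # run-specific configuration
        "--cores", str(cores),            # maximum number of CPU cores
    ]
    if dry_run:
        # Only list the jobs that would be executed, without running them.
        cmd.append("--dry-run")
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, shown as a shell-quoted command string.
    command = build_snakemake_command("Snakefile", "config.yaml", cores=8, dry_run=True)
    print(shlex.join(command))
```

Because the command is built as an argument list rather than a shell string, it could be passed to `subprocess.run(command, check=True)` on any platform where Snakemake is installed, which is one way to avoid depending on a Linux shell.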
