Abstract

BackgroundMeasuring how gene expression changes in the course of an experiment assesses how an organism responds on a molecular level. Sequencing of RNA molecules, and their subsequent quantification, aims to assess global gene expression changes on the RNA level (transcriptome). While advances in high-throughput RNA-sequencing (RNA-seq) technologies allow for inexpensive data generation, accurate post-processing and normalization across samples is required to eliminate any systematic noise introduced by the biochemical and/or technical processes. Existing methods thus either normalize on selected known reference genes that are invariant in expression across the experiment, assume that the majority of genes are invariant, or that the effects of up- and down-regulated genes cancel each other out during the normalization.ResultsHere, we present a novel method, moose2, which predicts invariant genes in silico through a dynamic programming (DP) scheme and applies a quadratic normalization based on this subset. The method allows for specifying a set of known or experimentally validated invariant genes, which guides the DP. We experimentally verified the predictions of this method in the bacterium Escherichia coli, and show how moose2 is able to (i) estimate the expression value distances between RNA-seq samples, (ii) reduce the variation of expression values across all samples, and (iii) to subsequently reveal new functional groups of genes during the late stages of DNA damage. We further applied the method to three eukaryotic data sets, on which its performance compares favourably to other methods. The software is implemented in C++ and is publicly available from http://grabherr.github.io/moose2/.ConclusionsThe proposed RNA-seq normalization method, moose2, is a valuable alternative to existing methods, with two major advantages: (i) in silico prediction of invariant genes provides a list of potential reference genes for downstream analyses, and (ii) non-linear artefacts in RNA-seq data are handled adequately to minimize variations between replicates.

Highlights

  • Measuring how gene expression changes in the course of an experiment assesses how an organism responds on a molecular level

  • In summary, we present a novel method, moose2, and show that it corrects for systematic bias in RNA-seq expression data from a bacterial data set by normalizing expression values against a set of genes that were predicted as invariant in silico

  • When applied to more complex eukaryotic data sets, the method performs consistently as well as, or better than, other RNA-seq normalization methods, indicating that its algorithm is applicable to a wider set of organisms

Read more

Summary

Introduction

Measuring how gene expression changes in the course of an experiment assesses how an organism responds on a molecular level. Results: Here, we present a novel method, moose, which predicts invariant genes in silico through a dynamic programming (DP) scheme and applies a quadratic normalization based on this subset. Conclusions: The proposed RNA-seq normalization method, moose, is a valuable alternative to existing methods, with two major advantages: (i) in silico prediction of invariant genes provides a list of potential reference genes for downstream analyses, and (ii) non-linear artefacts in RNA-seq data are handled adequately to minimize variations between replicates. Assumptions underlying the latter methods are that: (i) the mean expression of genes, on which the normalization is computed, does not change across experiments; and (ii) that a single global scaling factor is valid over the entire dynamic range of expression. Our approach aims at satisfying the assumptions that: (i) there is a small, identifiable subset of genes that is not differentially expressed in the given experiment; and (ii) that a quadratic function approximates any non-linear characteristics of expression measurements across the dynamic range

Methods
Results
Discussion
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.