On the relevance of technical variation due to building pools in microarray experiments.

Henrik Rudolf, Norbert Reinsch, Jens Vanselow, Martin Beye, Gerd Nuernberg, Dirk Koczan, Tanja Gempe, Kaspar Bienefeld, Gérard Leboulle

doi:10.1186/s12864-015-2055-6

Abstract

Background: Pooled samples are frequently used in experiments measuring gene expression. In this method, RNA from different individuals sharing the same experimental conditions and explanatory variables is blended and their concentrations are jointly measured. As a matter of principle, individuals are represented in equal shares in each pool. However, some degree of disproportionality may arise from the limits of technical precision. As a consequence a special kind of technical error occurs, which can be modelled by a respective variance component. Previously published theory - allowing for variable pool sizes - has been applied to four microarray gene expression data sets from different species in order to assess the practical relevance of this type of technical error in terms of significance and size of this variance component.

Results: The number of transcripts with a significant variance component due to imperfect blending was found to be 4329 (23 %) in mouse data and 7093 (49 %) in honey bees, but only 6 in rats and none whatsoever in human data. These results correspond to a false discovery rate of 5 % in each data set. The number of transcripts found to be differentially expressed between treatments was always higher when the blending error variance was neglected. Simulations clearly indicated overly-optimistic (anti-conservative) test results in terms of false discovery rates whenever this source of variability was not represented in the model.

Conclusions: Imperfect equality of shares when blending RNA from different individuals into joint pools of variable size is a source of technical variation with relevance for experimental design, practice at the laboratory bench and data analysis. Its potentially adverse effects, incorrect identification of differentially expressed transcripts and overly-optimistic significance tests, can be fully avoided, however, by the sound application of recently established theory and models for data analysis.

Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-2055-6) contains supplementary material, which is available to authorized users.

Highlights

Pooled samples are frequently used in experiments measuring gene expression
The number of transcripts with a significant variance component due to imperfect blending was found to be 4329 (23 %) in mouse data and 7093 (49 %) in honey bees, but only 6 in rats and none whatsoever in human data. These results correspond to a false discovery rate of 5 % in each data set
The number of transcripts found to be differentially expressed between treatments was always higher when the blending error variance was neglected

Summary

Introduction

Pooled samples are frequently used in experiments measuring gene expression. In this method, RNA from different individuals sharing the same experimental conditions and explanatory variables is blended and their concentrations are jointly measured. Previously published theory, which allows for variable pool sizes, has been applied to four microarray gene expression data sets from different species in order to assess the practical relevance of the technical error caused by imperfect blending, in terms of the significance and size of the corresponding variance component. In the case of a balanced cross-classification, the pool size effect need not appear explicitly in the model at all, and hypotheses on treatments nevertheless remain unbiased, as shown in Rudolf et al. [3].
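To make the idea of a blending error more concrete, the following is a minimal illustrative sketch, not the model of Rudolf et al. It assumes that each measured pool value is a weighted average of individual expression values, with weights that are nominally equal but perturbed by random technical imprecision. The function name simulate_pool_values, the baseline expression level of 10.0, and the parameters bio_sd and blend_sd are all hypothetical choices made for this illustration.

```python
import random
import statistics


def simulate_pool_values(n_pools, pool_size, bio_sd, blend_sd, seed=0):
    """Simulate measured pool values under imperfect blending.

    Assumption (illustration only): a pool's value is the share-weighted
    average of its members' expression values. Shares are nominally
    1 / pool_size, perturbed by Gaussian noise with standard deviation
    blend_sd and then renormalised so that they sum to one.
    """
    rng = random.Random(seed)
    pool_values = []
    for _ in range(n_pools):
        # Biological variation between the individuals in this pool.
        expression = [rng.gauss(10.0, bio_sd) for _ in range(pool_size)]
        # Technical variation in the amount of RNA each individual contributes.
        raw = [max(1e-9, 1.0 + rng.gauss(0.0, blend_sd)) for _ in range(pool_size)]
        total = sum(raw)
        shares = [r / total for r in raw]
        pool_values.append(sum(s * x for s, x in zip(shares, expression)))
    return pool_values


if __name__ == "__main__":
    perfect = simulate_pool_values(20000, 5, bio_sd=1.0, blend_sd=0.0, seed=1)
    imperfect = simulate_pool_values(20000, 5, bio_sd=1.0, blend_sd=0.5, seed=1)
    print("variance with equal shares:  ", statistics.variance(perfect))
    print("variance with unequal shares:", statistics.variance(imperfect))
```

Under these assumptions, the variance between pools is larger when shares are unequal than when they are exactly equal. If an analysis ignores that extra component, it may underestimate the error variance, which is consistent with the anti-conservative tests described above.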



Journal: BMC Genomics
Publication Date: Dec 1, 2015
Citations: 10
License type: cc-by
