Validation of differential gene expression algorithms: Application comparing fold-change estimation to hypothesis testing

Corey M Yanofsky, David R Bickel

doi:10.1186/1471-2105-11-63

Abstract

Background: Sustained research on the problem of determining which genes are differentially expressed on the basis of microarray data has yielded a plethora of statistical algorithms, each justified by theory, simulation, or ad hoc validation and yet differing in practical results from equally justified algorithms. Recently, a concordance method that measures agreement among gene lists has been introduced to assess various aspects of differential gene expression detection. This method has the advantage of basing its assessment solely on the results of real data analyses, but as it requires examining gene lists of given sizes, it may be unstable.

Results: Two methodologies for assessing predictive error are described: a cross-validation method and a posterior predictive method. As a nonparametric method of estimating prediction error from observed expression levels, cross validation provides an empirical approach to assessing algorithms for detecting differential gene expression that is fully justified for large numbers of biological replicates. Because it leverages the knowledge that only a small portion of genes are differentially expressed, the posterior predictive method is expected to provide more reliable estimates of algorithm performance, allaying concerns about limited biological replication. In practice, the posterior predictive method can assess when its approximations are valid and when they are inaccurate. Under conditions in which its approximations are valid, it corroborates the results of cross validation. Both comparison methodologies apply to single-channel and dual-channel microarrays. For the data sets considered, estimating prediction error by cross validation demonstrates that empirical Bayes methods based on hierarchical models tend to outperform algorithms based on selecting genes by their fold changes or by non-hierarchical model-selection criteria. (The latter two approaches have comparable performance.) The posterior predictive assessment corroborates these findings.

Conclusions: Algorithms for detecting differential gene expression may be compared by estimating each algorithm's error in predicting expression ratios, whether such ratios are defined across microarray channels or between two independent groups. According to two distinct estimators of prediction error, algorithms using hierarchical models outperform the other algorithms of the study. The fact that fold-change shrinkage performed as well as conventional model selection criteria calls for investigating algorithms that combine the strengths of significance testing and fold-change estimation.

Highlights

Sustained research on the problem of determining which genes are differentially expressed on the basis of microarray data has yielded a plethora of statistical algorithms, each justified by theory, simulation, or ad hoc validation and yet differing in practical results from equally justified algorithms
The inability of RT-PCR to validate a microarray prediction of differential gene expression might indicate a problem with the statistical assumptions used to make the prediction, but it may instead reflect a problem with cross hybridization due to the microarray platform
Participants in the MicroArray Quality Control (MAQC) project avoided such confounding between microarray platform effects and statistical method effects by quantifying the degree of overlap between gene lists produced by an algorithm on the basis of two independent data sets [8]

Summary

Introduction

Sustained research on the problem of determining which genes are differentially expressed on the basis of microarray data has yielded a plethora of statistical algorithms, each justified by theory, simulation, or ad hoc validation and yet differing in practical results from equally justified algorithms. A concordance method that measures agreement among gene lists has been introduced to assess various aspects of differential gene expression detection. This method has the advantage of basing its assessment solely on the results of real data analyses, but as it requires examining gene lists of given sizes, it may be unstable. Although a significant step forward, this way of comparing algorithms, like that of [10], requires examining gene lists of given sizes, which is why Chen et al. [11] consider the concordance to be too unstable for use as an algorithm performance criterion.
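The dependence of concordance on list size can be seen in a small sketch. The function names and the toy scores are hypothetical, and concordance is taken here simply as the fraction of the top-k genes shared by two independently derived rankings, which is an assumption about the general idea rather than the exact definition used in [8] or [11].

```python
import numpy as np

def top_k(scores, k):
    """Indices of the k genes with the largest absolute scores."""
    order = np.argsort(-np.abs(np.asarray(scores)))
    return set(order[:k].tolist())

def concordance(scores_a, scores_b, k):
    """Fraction of the top-k list from one data set that also appears in the other's."""
    return len(top_k(scores_a, k) & top_k(scores_b, k)) / k

# Toy per-gene scores (e.g. mean log ratios) from two independent data sets.
scores_a = [3.0, -2.5, 0.1, 0.2, 1.0]
scores_b = [2.8, 0.3, -2.9, 0.1, 1.2]

for k in (2, 3, 5):
    print(k, concordance(scores_a, scores_b, k))
# k=2 -> 0.5, k=3 -> 2/3, k=5 -> 1.0
```

The same pair of rankings yields a different concordance for each list size k, which is the source of the instability noted above.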



Journal: BMC Bioinformatics
Publication Date: Jan 28, 2010
Citations: 60
License type: CC BY 2.0


