Detecting differential expression in microarray data: comparison of optimal procedures.

Elena Perelman, Alexander Ploner, Stefano Calza, Yudi Pawitan

doi:10.1186/1471-2105-8-28


Open Access

https://doi.org/10.1186/1471-2105-8-28


Abstract

Background: Many procedures for finding differentially expressed genes in microarray data are based on classical or modified t-statistics. Due to multiple testing considerations, the false discovery rate (FDR) is the key tool for assessing the significance of these test statistics. Two recent papers have generalized two aspects: Storey et al. (2005) have introduced a likelihood ratio test statistic for two-sample situations that has desirable theoretical properties (optimal discovery procedure, ODP), but uses standard FDR assessment; Ploner et al. (2006) have introduced a multivariate local FDR that allows incorporation of standard error information, but uses the standard t-statistic (fdr2d). The relationship and relative performance of these methods in two-sample comparisons is currently unknown.

Methods: Using simulated and real datasets, we compare the ODP and fdr2d procedures. We also introduce a new procedure called S2d that combines the ODP test statistic with the extended FDR assessment of fdr2d.

Results: For both simulated and real datasets, fdr2d performs better than ODP. As expected, both methods perform better than a standard t-statistic with standard local FDR. The new procedure S2d performs as well as fdr2d on simulated data, but performs better on the real datasets.

Conclusion: The ODP can be improved by including the standard error information as in fdr2d. This means that the optimality enjoyed in theory by ODP does not hold for the estimated version that has to be used in practice. The new procedure S2d has a slight advantage over fdr2d, which has to be balanced against a significantly higher computational effort and a less intuitive test statistic.

Highlights

Many procedures for finding differentially expressed genes in microarray data are based on classical or modified t-statistics.
Building on the Neyman-Pearson lemma for testing an individual hypothesis, the author shows that an extension of the likelihood ratio test statistic for multiple parallel hypotheses is the optimal procedure for deciding whether any specific gene is differentially expressed (DE): for any fixed number of false positive results, ODP will identify the maximum number of true positives.
In order to compare different fdr procedures, we summarize their results via operating characteristics (OC) curves: for each procedure, we sort the groups of genes as described above by their local fdr, and compute the corresponding global false discovery rate (FDR) as the cumulative mean of the local fdrs from the smallest to the largest.
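
The cumulative-mean step behind the OC curves can be illustrated with a minimal sketch. It assumes one local fdr value per gene (the paper sorts groups of genes, which would call for a size-weighted mean), and the function name `oc_curve` and the input values are hypothetical, chosen only for illustration.

```python
from itertools import accumulate


def oc_curve(local_fdr):
    """Return a list of (number of genes selected, global FDR) pairs.

    Assumption: each entry of `local_fdr` is the local fdr of a single gene.
    The global FDR for the k genes with the smallest local fdr is taken as
    the mean of those k local fdr values (a cumulative mean).
    """
    ordered = sorted(local_fdr)          # smallest local fdr first
    running_sums = accumulate(ordered)   # cumulative sums of the sorted values
    return [(k, total / k) for k, total in enumerate(running_sums, start=1)]


# Hypothetical local fdr values for three genes.
curve = oc_curve([0.5, 0.125, 0.25])
for n_selected, global_fdr in curve:
    print(n_selected, round(global_fdr, 4))
```

Because the values are sorted before averaging, the resulting global FDR never decreases as more genes are selected, so curves from different procedures can be compared at the same number of selected genes.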

Summary

Introduction

Many procedures for finding differentially expressed genes in microarray data are based on classical or modified t-statistics. Due to multiple testing considerations, the false discovery rate (FDR) is the key tool for assessing the significance of these test statistics. The need to identify a possibly very small number of regulated genes among the tens of thousands of sequences found on modern microarray chips, based on tens to hundreds of biological samples, has led to a plethora of different methods. Many competing methods for detecting differential expression (DE) exist, and even attempts at validation on data sets with known mRNA composition [4] cannot offer definitive guidelines. In this context, the introduction of the so-called optimal discovery procedure (ODP, [5]) constitutes a major conceptual achievement. The ODP establishes a theoretical optimum for detecting DE against which any other method can be measured.



Journal: BMC Bioinformatics	Publication Date: Jan 26, 2007
Citations: 31	License type: cc-by


Similar Papers

Comparison of methods for identifying differentially expressed genes across multiple conditions from microarray data.
Yuande Tan ... Yin Liu
Bioinformation | VOL. 7
21 Dec 2011

A unified approach to false discovery rate estimation
Korbinian Strimmer
BMC Bioinformatics | VOL. 9
09 Jul 2008

A note on using permutation-based false discovery rate estimates to compare different analysis methods for microarray data
Yang Xie ... Arkady B Khodursky
Bioinformatics | VOL. 21
27 Sep 2005

A New Test Statistic Based on Shrunken Sample Variance for Identifying Differentially Expressed Genes in Small Microarray Experiments
Akihiro Hirakawa ... Isao Yoshimura
Bioinformatics and Biology Insights | VOL. 2
01 Jan 2008
