Abstract

A gene-by-gene mixed model analysis is a useful statistical method for assessing the significance of differential gene expression in microarray experiments. Although a microarray experiment collects a large amount of data on thousands of genes, the sample size for each gene is usually small, which can limit the statistical power of the analysis. In this report, we introduce an empirical Bayes (EB) approach for general variance component models applied to microarray data. Within a linear mixed model framework, the restricted maximum likelihood (REML) estimates of the variance components of each gene are adjusted by integrating information on the variance components estimated from all genes. The approach starts with a series of single-gene analyses. The estimated variance components from each gene are transformed to “ANOVA components.” This transformation makes it possible to estimate the marginal distribution of each “ANOVA component” independently. The modes of the posterior distributions are estimated and inversely transformed to compute the posterior estimates of the variance components. The EB statistic is constructed by replacing the REML variance estimates with the EB variance estimates in the usual t statistic. The EB approach is illustrated with a real data example that compares the effects of five different genotypes of male flies on post-mating gene expression in female flies. In a simulation study, receiver operating characteristic (ROC) curves are used to compare the EB statistic with two other statistics. The EB statistic was found to be the most powerful of the three. Although the null distribution of the EB statistic is unknown, a t distribution may be used to provide conservative control of the false positive rate.
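To make the idea of replacing a per-gene variance estimate with a posterior mode more concrete, here is a minimal sketch of a deliberately simplified case. It is not the procedure of this report: it assumes a single residual variance per gene (no mixed model and no transformation to “ANOVA components”) and a conjugate scaled inverse chi-square prior. The function names and the prior hyperparameters `d0` and `s0_sq` are hypothetical; in practice the prior would be estimated from all genes rather than supplied by hand.

```python
import numpy as np


def eb_variance_mode(s2, d, d0, s0_sq):
    """Posterior mode of each gene's variance.

    Assumption: sigma^2 has a scaled inverse chi-square prior with d0 degrees
    of freedom and scale s0_sq, and s2 is a per-gene variance estimate on d
    degrees of freedom. The posterior is then scaled inverse chi-square with
    d0 + d degrees of freedom, and its mode is given by the expression below.
    """
    s2 = np.asarray(s2, dtype=float)
    return (d0 * s0_sq + d * s2) / (d0 + d + 2.0)


def two_sample_t(diff, variance, n1, n2):
    """Two-sample t-type statistic for a mean difference and a variance estimate."""
    se = np.sqrt(np.asarray(variance, dtype=float) * (1.0 / n1 + 1.0 / n2))
    return np.asarray(diff, dtype=float) / se


# Toy data for three genes (illustrative values only).
diff = np.array([1.0, 1.0, 0.2])   # observed mean differences
s2 = np.array([0.01, 1.0, 0.5])    # per-gene variance estimates
n1 = n2 = 4                        # replicates per group
d = n1 + n2 - 2                    # residual degrees of freedom

s2_eb = eb_variance_mode(s2, d, d0=4.0, s0_sq=0.5)  # hypothetical prior
t_ordinary = two_sample_t(diff, s2, n1, n2)
t_eb = two_sample_t(diff, s2_eb, n1, n2)
```

In this toy example the first gene has a very small variance estimate, so its ordinary t statistic is large; pulling its variance toward the prior reduces the statistic, which is the kind of stabilization that borrowing information across genes is meant to provide.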
