Abstract

BackgroundIn microarray experiments with small sample sizes, it is a challenge to estimate p-values accurately and decide cutoff p-values for gene selection appropriately. Although permutation-based methods have proved to have greater sensitivity and specificity than the regular t-test, their p-values are highly discrete due to the limited number of permutations available in very small sample sizes. Furthermore, estimated permutation-based p-values for true nulls are highly correlated and not uniformly distributed between zero and one, making it difficult to use current false discovery rate (FDR)-controlling methods.ResultsWe propose a model-based information sharing method (MBIS) that, after an appropriate data transformation, utilizes information shared among genes. We use a normal distribution to model the mean differences of true nulls across two experimental conditions. The parameters of the model are then estimated using all data in hand. Based on this model, p-values, which are uniformly distributed from true nulls, are calculated. Then, since FDR-controlling methods are generally not well suited to microarray data with very small sample sizes, we select genes for a given cutoff p-value and then estimate the false discovery rate.ConclusionSimulation studies and analysis using real microarray data show that the proposed method, MBIS, is more powerful and reliable than current methods. It has wide application to a variety of situations.

Highlights

  • In microarray experiments with small sample sizes, it is a challenge to estimate p-values accurately and decide cutoff p-values for gene selection appropriately

  • The fold change (FC) method does not take into account the variability with gene expression measurements, or, even worse, it assumes that the variability for all expression measurements is the same, which is likely to be false for most gene expression experiments

  • For Model-based information sharing (MBIS) and t-test without permutation, we know the distribution of all nulls and, we can estimate the number of false positives (Est.FP) for a given cutoff p-value

Read more

Summary

Introduction

In microarray experiments with small sample sizes, it is a challenge to estimate p-values accurately and decide cutoff p-values for gene selection appropriately. Permutation-based methods have proved to have greater sensitivity and specificity than the regular t-test, their p-values are highly discrete due to the limited number of permutations available in very small sample sizes. The sample sizes are small compared to the number of genes to be investigated, making estimation of standard error for statistical tests very inaccurate. Thousands of hypotheses (one corresponding to each gene or set of genes, in general) are tested at once, which greatly increases the probability of Type I error. This problem is called the “multiple comparison problem” in hypothesis testing.

Methods
Results
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call