Sgof: An R Package for Multiple Testing Problems

Irene Castro-Conde, Jacobo de Uña-Álvarez

doi:10.32614/rj-2014-027

Abstract

In this paper we present a new R package called sgof for multiple hypothesis testing. The principal aim of this package is to implement SGoF-type multiple testing methods, known to be more powerful than the classical false discovery rate (FDR) and family-wise error rate (FWER) based methods in certain situations, particularly when the number of tests is large. This package includes Binomial and Conservative SGoF and the Bayesian and Beta-Binomial SGoF multiple testing procedures, which are adaptations of the original SGoF method to the Bayesian setting and to possibly correlated tests, respectively. The sgof package also implements the Benjamini-Hochberg and Benjamini-Yekutieli FDR controlling procedures. For each method the package provides (among other things) the number of rejected null hypotheses, estimation of the corresponding FDR, and the set of adjusted p values. Some automatic plots of interest are implemented too. Two real data examples are used to illustrate how sgof works.

Highlights

Multiple testing refers to any instance that involves the simultaneous testing of several null hypotheses, i.e., H01, H02, ..., H0n.
Many statistical inference problems in areas such as genomics and proteomics involve the simultaneous testing of thousands of null hypotheses, producing as a result a number of significant p values or effects (an increase in gene expression, or RNA/protein levels).

Summary

Introduction

Many statistical inference problems in areas such as genomics and proteomics involve the simultaneous testing of thousands of null hypotheses, producing as a result a number of significant p values or effects (an increase in gene expression, or RNA/protein levels). These hypotheses may have complex and unknown dependence structures. In the multiple testing setting, a specific procedure is needed for deciding which null hypotheses should be rejected. In this sense, the family-wise error rate (FWER) and the false discovery rate (FDR) have been proposed as suitable significance criteria to perform the multiple testing adjustment. However, FDR and FWER based methods have the drawback that their power decreases rapidly as the number of tests grows. In particular situations, such as when there is a small to moderate proportion of weak effects, they may be unable to detect even one effect.
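The loss of power described above can be illustrated with a small, language-neutral sketch. It does not use the sgof package or reproduce any SGoF method; it only implements the standard Bonferroni (FWER) and Benjamini-Hochberg (FDR) rules in plain Python. The helper `toy_pvalues` and its numbers are hypothetical, chosen purely for illustration: five weak effects with p = 0.002 each, embedded among either 20 or 2000 tests.

```python
def bonferroni_rejections(pvalues, alpha=0.05):
    """Count p-values at or below alpha / n (controls the FWER)."""
    n = len(pvalues)
    return sum(1 for p in pvalues if p <= alpha / n)


def benjamini_hochberg_rejections(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up rule (controls the FDR).

    Returns the largest k such that the k-th smallest p-value
    satisfies p_(k) <= k * alpha / n, or 0 if there is none.
    """
    n = len(pvalues)
    k_max = 0
    for k, p in enumerate(sorted(pvalues), start=1):
        if p <= k * alpha / n:
            k_max = k
    return k_max


def toy_pvalues(n, n_effects=5, p_effect=0.002):
    """Hypothetical data: a few weak effects among n tests.

    The remaining p-values are spread evenly over (0.05, 1],
    so none of them is individually significant.
    """
    n_null = n - n_effects
    nulls = [0.05 + 0.95 * (i + 1) / n_null for i in range(n_null)]
    return [p_effect] * n_effects + nulls


small = toy_pvalues(20)    # 5 weak effects among 20 tests
large = toy_pvalues(2000)  # the same 5 weak effects among 2000 tests

for label, pvals in (("n = 20", small), ("n = 2000", large)):
    print(label,
          "Bonferroni:", bonferroni_rejections(pvals),
          "BH:", benjamini_hochberg_rejections(pvals))
```

With these toy inputs both rules reject the five effects when n = 20 and reject nothing when n = 2000, even though the five p values themselves are unchanged. This is the situation the SGoF-type methods are intended to address.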

Journal: The R Journal
Publication Date: Jan 1, 2014
Citations: 26
License type: cc-by

Similar Papers

Experimental and Statistical Considerations to Avoid False Conclusions in Proteomics Studies Using Differential In-gel Electrophoresis
Natasha A Karp ... Kathryn S Lilley
Molecular & Cellular Proteomics | VOL. 6
01 Aug 2007

Multiple Hypothesis Testing to Detect Lineages under Positive Selection that Affects Only a Few Sites
M Anisimova ... Z Yang
Molecular Biology and Evolution | VOL. 24
13 Feb 2007

Incorporating the number of true null hypotheses to improve power in multiple testing: application to gene microarray data
Huey-Miin Hsueh ... James J Chen
Journal of Statistical Computation and Simulation | VOL. 77
01 Sep 2007

Decision theory results for one-sided multiple comparison procedures
Arthur Cohen ... Harold B Sackrowitz
The Annals of Statistics | VOL. 33
01 Feb 2005
