Analyzing gene expression data in terms of gene sets: methodological issues

Jelle J Goeman, Peter Bühlmann

doi:10.1093/bioinformatics/btm051

Abstract

Many statistical tests have been proposed in recent years for analyzing gene expression data in terms of gene sets, usually from Gene Ontology. These methods are based on widely different methodological assumptions. Some approaches test differential expression of each gene set against differential expression of the rest of the genes, whereas others test each gene set on its own. Also, some methods are based on a model in which the genes are the sampling units, whereas others treat the subjects as the sampling units. This article aims to clarify the assumptions behind different approaches and to indicate a preferential methodology of gene set testing. We identify some crucial assumptions which are needed by the majority of methods. P-values derived from methods that use a model which takes the genes as the sampling unit are easily misinterpreted, as they are based on a statistical model that does not resemble the biological experiment actually performed. Furthermore, because these models are based on a crucial and unrealistic independence assumption between genes, the P-values derived from such methods can be wildly anti-conservative, as a simulation experiment shows. We also argue that methods that competitively test each gene set against the rest of the genes create an unnecessary rift between single gene testing and gene set testing.
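The anti-conservatism claim can be illustrated with a small simulation. What follows is a minimal sketch under stated assumptions, not the simulation experiment from the article: all function names, the latent-factor correlation model, the z-test used as the gene-sampling method, the permutation test used as the subject-sampling method, and every parameter value (10 subjects, 100 genes, a set of 15 genes, correlation 0.5) are hypothetical choices made for illustration. No gene is differentially expressed, so any rejection is a false positive.

```python
import itertools
import math
import random
import statistics


def simulate_null_data(rng, n_subjects, n_genes, set_size, rho):
    """Return data[g][i]; no gene is differentially expressed.

    The first `set_size` genes form the gene set and share one latent
    factor, giving them pairwise correlation `rho`; the rest are independent.
    """
    factor = [rng.gauss(0, 1) for _ in range(n_subjects)]
    data = []
    for g in range(n_genes):
        noise = [rng.gauss(0, 1) for _ in range(n_subjects)]
        if g < set_size:
            row = [math.sqrt(rho) * f + math.sqrt(1 - rho) * e
                   for f, e in zip(factor, noise)]
        else:
            row = noise
        data.append(row)
    return data


def mean_diff(values, group_a):
    """Mean of `values` over subjects in group_a minus mean over the others."""
    a = [v for i, v in enumerate(values) if i in group_a]
    b = [v for i, v in enumerate(values) if i not in group_a]
    return sum(a) / len(a) - sum(b) / len(b)


def gene_sampling_p(data, set_size, group_a):
    """Two-sample z-test on per-gene statistics, treating genes as i.i.d."""
    d = [mean_diff(row, group_a) for row in data]
    in_set, rest = d[:set_size], d[set_size:]
    m, r = len(in_set), len(rest)
    pooled = ((m - 1) * statistics.variance(in_set)
              + (r - 1) * statistics.variance(rest)) / (m + r - 2)
    z = (sum(in_set) / m - sum(rest) / r) / math.sqrt(pooled * (1 / m + 1 / r))
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value


def subject_sampling_p(data, set_size, group_a):
    """Exact permutation test over subject labels on the mean set expression."""
    n = len(data[0])
    score = [sum(data[g][i] for g in range(set_size)) / set_size
             for i in range(n)]
    observed = abs(mean_diff(score, group_a))
    hits = total = 0
    # Enumerate every way of assigning subjects to the first group.
    for combo in itertools.combinations(range(n), len(group_a)):
        total += 1
        if abs(mean_diff(score, set(combo))) >= observed - 1e-12:
            hits += 1
    return hits / total


def false_positive_rates(n_reps=200, n_subjects=10, n_genes=100, set_size=15,
                         rho=0.5, alpha=0.05, seed=0):
    """Fraction of null data sets in which each test rejects at level alpha."""
    rng = random.Random(seed)
    group_a = set(range(n_subjects // 2))
    gene_hits = subject_hits = 0
    for _ in range(n_reps):
        data = simulate_null_data(rng, n_subjects, n_genes, set_size, rho)
        gene_hits += gene_sampling_p(data, set_size, group_a) <= alpha
        subject_hits += subject_sampling_p(data, set_size, group_a) <= alpha
    return gene_hits / n_reps, subject_hits / n_reps


if __name__ == "__main__":
    gene_rate, subject_rate = false_positive_rates()
    print(f"gene-sampling false positive rate:    {gene_rate:.3f}")
    print(f"subject-sampling false positive rate: {subject_rate:.3f}")
```

In this setup the gene-sampling test would be expected to reject far more often than the nominal 5%, because the shared factor moves all genes in the set together while the test treats them as independent draws. The permutation test reshuffles whole subjects, which keeps the correlation between genes intact, so its rejection rate should stay at or below the nominal level. Setting `rho=0` removes the correlation and should bring the gene-sampling rate back close to nominal.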

Journal: Bioinformatics
Publication Date: Feb 15, 2007
Citations: 787

Similar Papers

Comparative Evaluation of Set-Level Techniques in Microarray Classification
Jiri Klema ... Jakub Tolar
01 Jan 2010

Comparative evaluation of set-level techniques in predictive classification of gene expression samples
Matěj Holec ... Jakub Tolar
BMC Bioinformatics | VOL. 13
01 Jun 2012

An Independent Filter for Gene Set Testing Based on Spectral Enrichment
H Robert Frost ... Folkert W Asselbergs
IEEE/ACM Transactions on Computational Biology and Bioinformatics | VOL. 12
01 Sep 2015

Gene set analysis methods: a systematic comparison
Ravi Mathur ... Alison Motsinger-Reif
BioData Mining | VOL. 11
31 May 2018
