A Bayesian variable selection procedure to rank overlapping gene sets

Axel Skarman, Mohammad Shariati, Li Jiang, Luc Jans, Peter Sørensen

doi:10.1186/1471-2105-13-73

Open Access

https://doi.org/10.1186/1471-2105-13-73

Abstract

Background: Genome-wide expression profiling using microarrays or sequence-based technologies allows us to identify genes and genetic pathways whose expression patterns influence complex traits. Different methods to prioritize gene sets, such as the genes in a given molecular pathway, have been described. In many cases, these methods test one gene set at a time, and therefore do not consider overlaps among the pathways. Here, we present a Bayesian variable selection method to prioritize gene sets that overcomes this limitation by considering all gene sets simultaneously. We applied Bayesian variable selection to differential expression to prioritize the molecular and genetic pathways involved in the responses to Escherichia coli infection in Danish Holstein cows.

Results: We used a Bayesian variable selection method to prioritize Kyoto Encyclopedia of Genes and Genomes pathways. We used our data to study how the variable selection method was affected by overlaps among the pathways. In addition, we compared our approach to another that ignores the overlaps, and studied the differences in the prioritization. The variable selection method was robust to a change in prior probability and stable given a limited number of observations.

Conclusions: Bayesian variable selection is a useful way to prioritize gene sets while considering their overlaps. Ignoring the overlaps gives different and possibly misleading results. Additional procedures may be needed in cases of highly overlapping pathways that are hard to prioritize.

Highlights

Genome-wide expression profiling using microarrays or sequence-based technologies allows us to identify genes and genetic pathways whose expression patterns influence complex traits.
In this study we present a gene set approach based on the Bayesian variable selection method known as Stochastic Search Variable Selection (SSVS) [11].
Analysis of Variance (ANOVA)-based testing of one gene set at a time was used as the reference method for comparison with the Bayesian variable selection method.

Summary

Introduction

Genome-wide expression profiling using microarrays or sequence-based technologies allows us to identify genes and genetic pathways whose expression patterns influence complex traits. Gene set enrichment analysis (GSEA) can be implemented in a manner similar to a linear regression modeling approach that consists of three components: the incidence matrix linking genes to gene sets; the per-gene statistic vector, e.g., the t-statistic; and a per-set summing function. In this way, a large number of overlapping gene sets can be viewed as a linear regression with a large number of highly collinear regression variables. Choosing among these variables is a typical combinatorial model selection problem, which becomes computationally demanding as the number of gene sets increases.
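
The three components named above can be illustrated with a minimal sketch. This is not the authors' implementation: the gene names, gene sets, and per-gene statistics below are made-up toy values, and the helper `incidence_matrix` is a hypothetical name introduced only for this example. It shows how overlapping sets share rows of the incidence matrix (the source of the collinearity) and how quickly the number of candidate models grows.

```python
import numpy as np

# Hypothetical toy data: 6 genes and 3 gene sets (illustrative names only).
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
gene_sets = {
    "set_A": {"g1", "g2", "g3"},
    "set_B": {"g2", "g3", "g4"},  # shares g2 and g3 with set_A
    "set_C": {"g5", "g6"},
}


def incidence_matrix(genes, gene_sets):
    """Return a genes x sets 0/1 matrix; entry (i, j) is 1 if gene i is in set j."""
    names = list(gene_sets)
    X = np.zeros((len(genes), len(names)), dtype=int)
    for j, name in enumerate(names):
        for i, gene in enumerate(genes):
            if gene in gene_sets[name]:
                X[i, j] = 1
    return X, names


# Component 1: incidence matrix linking genes to gene sets.
X, set_names = incidence_matrix(genes, gene_sets)

# Component 2: per-gene statistic vector (assumed values, e.g. t-statistics).
t = np.array([2.0, 3.0, 1.0, -1.0, 0.5, 0.0])

# Component 3: per-set summing function, here a plain sum over member genes.
set_scores = X.T @ t

# Off-diagonal entries of X'X count the genes shared by each pair of sets;
# nonzero values mean the corresponding regression columns are correlated.
overlap = X.T @ X

# Each set can be in or out of the model, so there are 2^p candidate models.
n_models = 2 ** len(set_names)

for name, score in zip(set_names, set_scores):
    print(name, score)
print(overlap)
print(n_models)
```

With only three sets there are 8 candidate models, but the count doubles with every added set, which is why an exhaustive search is impractical for pathway collections with hundreds of sets.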

Journal: BMC Bioinformatics	Publication Date: May 3, 2012
Citations: 25	License type: cc-by
