Rank-based Bayesian variable selection for genome-wide transcriptomic analyses.

Emilie Eliseussen, Valeria Vitelli, Thomas Fleischer

doi:10.1002/sim.9524

Open Access

https://doi.org/10.1002/sim.9524

Abstract

Variable selection is crucial in high-dimensional omics-based analyses, since it is biologically reasonable to assume only a subset of non-noisy features contributes to the data structures. However, the task is particularly hard in an unsupervised setting, and a priori ad hoc variable selection is still a very frequent approach, despite the evident drawbacks and lack of reproducibility. We propose a Bayesian variable selection approach for rank-based unsupervised transcriptomic analysis. Making use of data rankings instead of the actual continuous measurements increases the robustness of conclusions when compared to classical statistical methods, and embedding variable selection into the inferential tasks allows complete reproducibility. Specifically, we develop a novel extension of the Bayesian Mallows model for variable selection that allows for a full probabilistic analysis, leading to coherent quantification of uncertainties. Simulation studies demonstrate the versatility and robustness of the proposed method in a variety of scenarios, as well as its superiority with respect to several competitors when varying the data dimension or data generating process. We use the novel approach to analyze genome-wide RNAseq gene expression data from ovarian cancer patients: several genes that affect cancer development are correctly detected in a completely unsupervised fashion, showing the usefulness of the method in the context of signature discovery for cancer genomics. Moreover, the possibility to also perform uncertainty quantification plays a key role in the subsequent biological investigation.
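
The rank-based idea in the abstract can be illustrated with a minimal sketch. It assumes the standard Mallows form with the footrule distance, where the unnormalized log-probability of a ranking r given a consensus ranking rho and scale alpha is -(alpha / n) * d(r, rho). The function names (`to_ranks`, `footrule`, `mallows_log_kernel`) and the toy expression values are hypothetical, and the variable selection extension proposed in the paper is not reproduced here.

```python
import numpy as np


def to_ranks(x):
    """Convert a 1-D array of measurements to ranks (rank 1 = largest value)."""
    order = np.argsort(-np.asarray(x, dtype=float), kind="stable")
    ranks = np.empty(len(order), dtype=int)
    ranks[order] = np.arange(1, len(order) + 1)
    return ranks


def footrule(r, rho):
    """Footrule distance: sum of absolute rank differences."""
    return int(np.abs(np.asarray(r) - np.asarray(rho)).sum())


def mallows_log_kernel(r, rho, alpha):
    """Unnormalized Mallows log-probability (assumed footrule form)."""
    n = len(rho)
    return -alpha / n * footrule(r, rho)


# Toy expression values for four hypothetical genes in one sample.
expr = np.array([5.1, 0.3, 2.2, 9.8])
ranks = to_ranks(expr)

# Ranks are invariant to monotone transformations such as the logarithm,
# which is one reason rank-based analyses are robust to the measurement scale.
same_after_log = np.array_equal(ranks, to_ranks(np.log(expr)))

# Log-kernel of the observed ranking under an assumed consensus ranking.
consensus = np.array([1, 2, 3, 4])
log_kernel = mallows_log_kernel(ranks, consensus, alpha=2.0)
```

The sketch only evaluates the likelihood kernel for a fixed consensus. A full Bayesian analysis would also place priors on alpha and rho and sample from their posterior.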

Journal: Statistics in Medicine
Publication Date: Jul 18, 2022
Citations: 2
License type: CC BY 4.0

Similar Papers

Gaussian process regression for survival time prediction with genome-wide gene expression.
Aaron J Molstad ... Wei Sun
Biostatistics (Oxford, England) | VOL. 22
11 Jul 2019

An Integrated Approach of Learning Genetic Networks From Genome-Wide Gene Expression Data Using Gaussian Graphical Model and Monte Carlo Method.
Haitao Zhao ... Zhong-Hui Duan
Bioinformatics and Biology Insights | VOL. 17
01 Jan 2023

PcaGoPromoter - An R Package for Biological and Regulatory Interpretation of Principal Components in Genome-Wide Gene Expression Data
Morten Hansen ... Jesper Thorvald Troelsen
PLoS ONE | VOL. 7
27 Feb 2012

Sure independence screening in the presence of missing data
Adriano Zanin Zambom ... Gregory J Matthews
Statistical Papers | VOL. 62
29 May 2019
