Multiple-input multiple-output causal strategies for gene selection

Gianluca Bontempi, Christos Sotiriou, John Quackenbush, Christine Desmedt, Benjamin Haibe-Kains

doi:10.1186/1471-2105-12-458

Open Access

Abstract

Background

Traditional strategies for selecting variables in high dimensional classification problems aim to find sets of maximally relevant variables able to explain the target variations. Although these techniques may be effective in terms of generalization accuracy, they often do not reveal direct causes. This is essentially because high correlation (or relevance) does not imply causation. In this study, we show how to efficiently incorporate causal information into gene selection by moving from a single-input single-output to a multiple-input multiple-output setting.

Results

We show in a synthetic case study that a better prioritization of causal variables can be obtained by considering a relevance score which incorporates a causal term. In addition, we show, in a meta-analysis study of six publicly available breast cancer microarray datasets, that the improvement also occurs in terms of accuracy. The biological interpretation of the results confirms the potential of a causal approach to gene selection.

Conclusions

Integrating causal information into gene selection algorithms is effective both in terms of prediction accuracy and biological interpretation.

Highlights

Traditional strategies for selecting variables in high dimensional classification problems aim to find sets of maximally relevant variables able to explain the target variations.
Of the two experimental studies, the first is based on a number of synthetic datasets generated by simulating a causal Bayesian network, while the second relies on public microarray breast cancer datasets to assess the approach in a real data setting.
Note that this causal structure aims to represent, in a very simplified manner, a stochastic dependency characterized by a number of indirect and direct causes, a latent non-measurable variable, one observable primary target, two secondary targets, a set of additional effects and a number of independent and irrelevant variables.

Summary

Introduction

Traditional strategies for selecting variables in high dimensional classification problems aim to find sets of maximally relevant variables able to explain the target variations. Although these techniques may be effective in terms of generalization accuracy, they often do not reveal direct causes. A drawback is that ranking relies on univariate terms and, as such, cannot take into consideration higher-order interaction terms or redundancy between features [2]. Another limitation is that ranking techniques are not able to distinguish between causes and mechanisms associated with disease and appropriate therapeutic targets.
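The limitation of univariate ranking described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the linear-Gaussian structure, the coefficients, the variable names and the helper functions `simulate_toy_structure` and `rank_by_abs_correlation` are hypothetical. They are not the Bayesian network used in the paper, and they are not its causal relevance score. The sketch only mimics the ingredients listed in the highlights (indirect and direct causes, a latent variable, one primary target, two secondary targets, an effect and an irrelevant variable) and ranks the observable variables by absolute Pearson correlation with the primary target.

```python
import numpy as np


def simulate_toy_structure(n_samples=5000, seed=0):
    """Draw samples from a small, hypothetical linear-Gaussian causal structure."""
    rng = np.random.default_rng(seed)

    def noise(scale=1.0):
        return rng.normal(0.0, scale, n_samples)

    indirect = noise()                    # indirect cause (acts through direct_1)
    direct_1 = 0.8 * indirect + noise()   # direct cause with an upstream parent
    direct_2 = noise()                    # direct cause with no parent
    latent = noise()                      # unmeasured variable, not returned as a feature

    # Primary target: driven by the two direct causes and the latent variable.
    target = direct_1 + direct_2 + latent + noise(0.5)

    # Secondary targets: share the latent parent with the primary target.
    secondary_1 = 0.7 * latent + noise()
    secondary_2 = 0.7 * latent + noise()

    effect = 2.0 * target + noise(0.5)    # downstream consequence of the target
    irrelevant = noise()                  # independent of everything else

    features = {
        "indirect": indirect,
        "direct_1": direct_1,
        "direct_2": direct_2,
        "effect": effect,
        "irrelevant": irrelevant,
    }
    return features, target, (secondary_1, secondary_2)


def rank_by_abs_correlation(features, target):
    """Univariate ranking: sort features by |Pearson correlation| with the target."""
    scores = {
        name: abs(np.corrcoef(values, target)[0, 1])
        for name, values in features.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    features, target, _ = simulate_toy_structure()
    for name, score in rank_by_abs_correlation(features, target):
        print(f"{name:>10s}  |corr| = {score:.2f}")
```

In this toy setup the `effect` variable comes out at the top of the ranking even though it is a consequence of the target rather than a cause, which is the kind of confusion a purely correlation-based ranking cannot resolve. The secondary targets are returned but deliberately left unused, since how they enter the paper's multiple-output score is not specified in this summary.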

Journal: BMC Bioinformatics
Publication Date: Nov 25, 2011
Citations: 7
License type: CC BY 2.0
