Validating module network learning algorithms using simulated data

Tom Michoel, Koenraad Van Leemput, Yvan Saeys, Anagha Joshi, Tim Van Den Bulcke, Eric Bonnet, Piet Van Remortel, Kathleen Marchal, Yves Van De Peer, Martin Kuiper, Steven Maere

doi:10.1186/1471-2105-8-s2-s5

Open Access

https://doi.org/10.1186/1471-2105-8-s2-s5

Abstract

Background: In recent years, several authors have used probabilistic graphical models to learn expression modules and their regulatory programs from gene expression data. Despite the demonstrated success of such algorithms in uncovering biologically relevant regulatory relations, further developments in the area are hampered by a lack of tools to compare the performance of alternative module network learning strategies. Here, we demonstrate the use of the synthetic data generator SynTReN for the purpose of testing and comparing module network learning algorithms. We introduce a software package for learning module networks, called LeMoNe, which incorporates a novel strategy for learning regulatory programs. Novelties include the use of a bottom-up Bayesian hierarchical clustering to construct the regulatory programs, and the use of a conditional entropy measure to assign regulators to the regulation program nodes. Using SynTReN data, we test the performance of LeMoNe in a completely controlled situation and assess the effect of the methodological changes we made with respect to an existing software package, namely Genomica. Additionally, we assess the effect of various parameters, such as the size of the data set and the amount of noise, on the inference performance.

Results: Overall, application of Genomica and LeMoNe to simulated data sets gave comparable results. However, LeMoNe offers some advantages, one of them being that the learning process is considerably faster for larger data sets. Additionally, we show that the location of the regulators in the LeMoNe regulation programs and their conditional entropy may be used to prioritize regulators for functional validation, and that the combination of the bottom-up clustering strategy with the conditional entropy-based assignment of regulators improves the handling of missing or hidden regulators.

Conclusion: We show that data simulators such as SynTReN are very well suited for the purpose of developing, testing and improving module network algorithms. We used SynTReN data to develop and test an alternative module network learning strategy, which is incorporated in the software package LeMoNe, and we provide evidence that this alternative strategy has several advantages with respect to existing methods.
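The idea of ranking candidate regulators by a conditional entropy measure can be illustrated with a minimal sketch. This is not LeMoNe's actual scoring function, which the abstract does not specify. It assumes that a regulation program node splits the experimental conditions into two groups and that each candidate regulator has been discretized into an on/off state per condition. All names and data below (`split`, `regulator_a`, `regulator_b`) are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def conditional_entropy(split, regulator_state):
    """H(split | regulator_state): remaining uncertainty about which side
    of the split a condition falls on, once the regulator state is known."""
    n = len(split)
    h = 0.0
    for state in set(regulator_state):
        subset = [s for s, r in zip(split, regulator_state) if r == state]
        h += (len(subset) / n) * entropy(subset)
    return h


# Hypothetical node: six conditions split into a left and a right group.
split = ["L", "L", "L", "R", "R", "R"]

# Hypothetical discretized regulator states (0 = low, 1 = high).
regulator_a = [0, 0, 0, 1, 1, 1]  # perfectly tracks the split
regulator_b = [0, 1, 0, 1, 0, 1]  # only weakly related to the split

scores = {
    "regulator_a": conditional_entropy(split, regulator_a),
    "regulator_b": conditional_entropy(split, regulator_b),
}

# Lower conditional entropy means the regulator explains the split better.
best = min(scores, key=scores.get)
print(scores, best)
```

In this toy setting the regulator whose state fully determines the split has zero conditional entropy and would be ranked first. A real implementation would also need a rule for discretizing regulator expression, which is left out here.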

Highlights

In recent years, several authors have used probabilistic graphical models to learn expression modules and their regulatory programs from gene expression data.
We used SynTReN data to develop and test an alternative module network learning strategy, which is incorporated in the software package LeMoNe, and we provide evidence that this alternative strategy has several advantages with respect to existing methods.
Implementation differences in LeMoNe versus Genomica: As a starting point for the development of LeMoNe, we reimplemented the methodology described by Segal et al. [6], which is incorporated in the Genomica software package.

Summary

Introduction

Several authors have used probabilistic graphical models to learn expression modules and their regulatory programs from gene expression data. Several studies combine expression data, promoter motif data, chromatin immunoprecipitation (ChIP) data and/or prior functional information (e.g. GO classifications [2] or known regulatory network structures) to elucidate transcriptional regulatory networks [3,4,5,6,7,8,9,10,11,12,13,14,15,16,17]. Most of these methods try to unravel the control logic underlying specific expression patterns. Friedman et al. pioneered the use of Bayesian networks to learn regulatory networks from expression data [3,4]. In these early studies, each gene in the resulting Bayesian network is associated with its individual regulation program, i.e., its own set of parents and conditional probability distribution. In a module network, by contrast, genes grouped into the same module share a single regulation program. As the number of parameters to be estimated in a module network is much smaller than in a full Bayesian network, the currently available gene expression data sets can be large enough for the purpose of learning module networks [6,11,12,19].
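The claim that a module network needs far fewer parameters than a full Bayesian network can be made concrete with a back-of-the-envelope count. The sketch below assumes, purely for illustration, that every regulation program is a tree with a fixed number of leaves and that each leaf holds a Gaussian with two parameters (mean and variance). The gene, module and leaf counts are hypothetical and are not taken from the paper.

```python
def count_parameters(n_programs: int, leaves_per_program: int,
                     params_per_leaf: int = 2) -> int:
    """Number of distribution parameters across all regulation programs.

    Assumes one Gaussian (mean, variance) per leaf of each program.
    """
    return n_programs * leaves_per_program * params_per_leaf


# Hypothetical sizes, chosen only to illustrate the scaling.
n_genes = 1000    # full Bayesian network: one regulation program per gene
n_modules = 50    # module network: one shared program per module
leaves = 4        # leaves per regulation program

full_network = count_parameters(n_genes, leaves)
module_network = count_parameters(n_modules, leaves)

print(full_network, module_network, full_network // module_network)
```

Under these assumptions the parameter count scales with the number of programs, so sharing one program among all genes of a module reduces it by the ratio of genes to modules.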

Journal: BMC Bioinformatics	Publication Date: May 1, 2007
Citations: 39	License type: CC BY 2.0
