Markov chain Monte Carlo for active module identification problem

Nikita Alexeev, Alexey Sergushichev, Vladimir Sukhov, Gennady Korotkevich, Javlon Isomurodov

doi:10.1186/s12859-020-03572-9

Open Access

https://doi.org/10.1186/s12859-020-03572-9

Journal: BMC Bioinformatics
Publication Date: Nov 1, 2020
Citations: 5
License type: open-access

Affiliation: ITMO University

Abstract

Background: Integrative network methods are commonly used for interpretation of high-throughput experimental biological data: transcriptomics, proteomics, metabolomics and others. One of the common approaches is finding a connected subnetwork of a global interaction network that best encompasses significant individual changes in the data and represents a so-called active module. Usually, methods implementing this approach find a single subnetwork and thus solve a hard classification problem for vertices. This subnetwork inherently contains erroneous vertices, while no instrument is provided to estimate the confidence level of any particular vertex inclusion. To address this issue, in the current study we consider the active module problem as a soft classification problem.

Results: We propose a method to estimate the probability that each vertex belongs to the active module, based on Markov chain Monte Carlo (MCMC) subnetwork sampling. As an example of the performance of our method on real data, we run it on two gene expression datasets. For the first, many-replicate expression dataset we show that the proposed approach is consistent with an existing resampling-based method. On the second dataset the jackknife resampling method is inapplicable due to the small number of biological replicates, but the MCMC method can be run and shows high classification performance.

Conclusions: The proposed method makes it possible to estimate the probability that an individual vertex belongs to the active module, as well as the false discovery rate (FDR) for a given set of vertices. Given the estimated probabilities, it becomes possible to provide a connected subgraph in a consistent manner for any given FDR level: no vertex can disappear when the FDR level is relaxed. We show, on both simulated and real datasets, that the proposed method has good computational performance and high classification accuracy.
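The link between per-vertex probabilities and an FDR level can be illustrated with a minimal Python sketch. It assumes a common convention that the abstract does not spell out: the estimated FDR of a vertex set is the mean of 1 - P(v ∈ M) over that set. The function names `estimated_fdr` and `top_ranked_set` are hypothetical, and the sketch ranks vertices only; it does not enforce the connectivity that the proposed method provides.

```python
def estimated_fdr(probabilities, selected):
    """Mean of (1 - P(v in M)) over `selected` (assumed FDR convention)."""
    selected = list(selected)
    if not selected:
        return 0.0
    return sum(1.0 - probabilities[v] for v in selected) / len(selected)


def top_ranked_set(probabilities, fdr_level):
    """Largest prefix of the probability ranking with estimated FDR <= fdr_level.

    Connectivity of the selected vertices is NOT enforced in this sketch.
    """
    # Rank vertices from most to least probable; sorted() is stable, so ties
    # keep the same order at every FDR level.
    ranked = sorted(probabilities, key=probabilities.get, reverse=True)
    chosen, error_sum = [], 0.0
    for v in ranked:
        error_sum += 1.0 - probabilities[v]
        # The running mean of (1 - p) never decreases along the ranking,
        # so the first violation ends the search.
        if error_sum / (len(chosen) + 1) > fdr_level:
            break
        chosen.append(v)
    return chosen


# Hypothetical probabilities, for illustration only.
example = {"a": 0.99, "b": 0.90, "c": 0.60, "d": 0.20}
strict = top_ranked_set(example, 0.05)
relaxed = top_ranked_set(example, 0.20)
```

Because every selection is a prefix of the same ranking, the set chosen at a stricter level is always contained in the set chosen at a more relaxed level, which mirrors the consistency property stated in the Conclusions.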

Highlights

Integrative network methods are commonly used for interpretation of high-throughput experimental biological data: transcriptomics, proteomics, metabolomics and others
In that paper, we proposed a semi-heuristic ranking method that outperformed both a baseline vertex ranking by individual input significance and a ranking derived from multiple BioNet runs with different thresholds
Markov chain Monte Carlo (MCMC) approach: our goal is to find out which vertices are likely to belong to the active module M. Problem 1 (Soft Classification Active Module Problem, SCAMP): given a connected graph G and vertex weights w_v ∈ [0, 1], find the probability P(v ∈ M | W = w) for each vertex v to belong to the module M
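To make Problem 1 concrete, the following is a minimal Python sketch of estimating P(v ∈ M | W = w) by sampling connected subgraphs with a Metropolis scheme. It is not the authors' implementation. Several ingredients are assumptions introduced for illustration: the weights are treated as p-values, vertices inside the module are assumed to follow a Beta(a, 1) density while the others are uniform, the prior over connected subgraphs is taken to be uniform, and the proposal flips one uniformly chosen vertex. The names `is_connected` and `sample_inclusion_probabilities` and the default value of `a` are hypothetical.

```python
import random
from collections import deque


def is_connected(adj, vertices):
    """True if `vertices` is non-empty and induces a connected subgraph."""
    if not vertices:
        return False
    start = next(iter(vertices))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for x in adj[u]:
            if x in vertices and x not in seen:
                seen.add(x)
                queue.append(x)
    return len(seen) == len(vertices)


def sample_inclusion_probabilities(adj, weights, a=0.2, n_steps=20000,
                                   burn_in=2000, seed=0):
    """Estimate per-vertex inclusion frequencies over sampled subgraphs."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    # ASSUMPTION: likelihood ratio of Beta(a, 1) against uniform,
    # a * p**(a - 1); p is clamped away from zero to avoid division by zero.
    ratio = {v: a * max(weights[v], 1e-300) ** (a - 1.0) for v in nodes}
    # Start from the single most significant vertex.
    current = {max(nodes, key=ratio.get)}
    counts = dict.fromkeys(nodes, 0)
    kept = 0
    for step in range(n_steps):
        v = rng.choice(nodes)  # symmetric proposal: flip one vertex
        if v in current:
            candidate = current - {v}
            valid = is_connected(adj, candidate)
            accept_prob = 1.0 / ratio[v]
        else:
            candidate = current | {v}
            valid = any(u in current for u in adj[v])
            accept_prob = ratio[v]
        # Metropolis rule: accept with probability min(1, accept_prob).
        if valid and rng.random() < accept_prob:
            current = candidate
        if step >= burn_in:
            kept += 1
            for u in current:
                counts[u] += 1
    return {v: counts[v] / kept for v in nodes}


# Hypothetical toy input: a path graph 0-1-2-3-4 with made-up p-values.
toy_adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
toy_weights = {0: 1e-6, 1: 1e-5, 2: 1e-4, 3: 0.8, 4: 0.9}
toy_probs = sample_inclusion_probabilities(toy_adj, toy_weights)
```

The fraction of retained samples that contain a vertex serves as the Monte Carlo estimate of its membership probability. Adding a vertex adjacent to the current subgraph and removing a vertex whose removal keeps the subgraph connected are exact reverses of each other, so the proposal is symmetric and the plain Metropolis acceptance ratio applies.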

Summary

Introduction

Integrative network methods are commonly used for interpretation of high-throughput experimental biological data: transcriptomics, proteomics, metabolomics and others. One of the common approaches is finding a connected subnetwork of a global interaction network that best encompasses significant individual changes in the data and represents a so-called active module. Methods implementing this approach find a single subnetwork and solve a hard classification problem for vertices. This subnetwork inherently contains erroneous vertices, while no instrument is provided to estimate the confidence level of any particular vertex inclusion. More sophisticated methods include using connections for gene set enrichment analysis [6], comparing networks [7], and many others.
