Abstract

Molecular entities work in concert as a system and mediate phenotypic outcomes and disease states. There has been recent interest in modelling the associations between molecular entities from their observed expression profiles as networks using a battery of algorithms. These networks have proven to be useful abstractions of the underlying pathways and signalling mechanisms. Noise is ubiquitous in molecular data and can have a pronounced effect on the inferred network. Noise can arise from several factors, including inherent stochastic mechanisms at the molecular level, variation in the abundance of molecules, heterogeneity, sensitivity of the biological assay, and measurement artefacts that are especially prevalent in high-throughput settings. The present study investigates the impact of discrepancies in noise variance on pairwise dependencies, conditional dependencies and constraint-based Bayesian network structure learning algorithms that incorporate conditional independence tests as part of the learning process. Three popular network motifs and fundamental connections are investigated: (a) common-effect, (b) three-chain, and (c) coherent type-I feed-forward loop (FFL). These elementary networks were chosen because of their prevalence across more complex networks. Analytical expressions elucidating the impact of discrepancies in noise variance on pairwise dependencies and conditional dependencies for special cases of these motifs are presented. Subsequently, the impact of noise on two popular constraint-based Bayesian network structure learning algorithms, Grow-Shrink (GS) and Incremental Association Markov Blanket (IAMB), which implicitly incorporate tests for conditional independence, is investigated. Finally, the impact of noise on networks inferred from publicly available single-cell molecular expression profiles is investigated.
While discrepancies in noise variance are overlooked in routine molecular network inference, the results presented show that they have a non-trivial impact on the conclusions, which in turn can challenge the biological significance of the findings. The analytical treatment and arguments presented are generic and not restricted to molecular data sets.

Highlights

  • Identifying associations and network structures from observational data sets obtained across a given set of entities is a challenging problem and of great interest across a spectrum of disciplines including molecular biology [1,2,3,4,5,6,7,8]

  • Discrepancies in the noise variances across the nodes are represented by parameters c_i > 0, i = 1, 2

  • The noise variance at the node x is fixed at unit variance, whereas those at y (c_1 > 0) and z (c_2 > 0) are varied systematically in order to understand the impact of discrepancy in noise variance on the conclusions
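
The setup in the highlights above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code: it assumes a linear three-chain x -> y -> z with unit edge weights and Gaussian noise (both assumptions made here for illustration), fixes the noise variance at x to 1, and uses c1 and c2 as the noise variances at y and z. The function names are hypothetical. It compares the sample correlation of x and y with the value 1/sqrt(1 + c1) that follows from this assumed model, and reports the partial correlation of x and z given y.

```python
import numpy as np


def simulate_three_chain(c1, c2, n=100_000, seed=0):
    """Draw n samples from a linear chain x -> y -> z.

    Assumptions for illustration only: unit edge weights, Gaussian noise,
    noise variance 1 at x, c1 at y and c2 at z.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0, n)
    y = x + rng.normal(0.0, np.sqrt(c1), n)
    z = y + rng.normal(0.0, np.sqrt(c2), n)
    return x, y, z


def partial_corr(a, b, given):
    """First-order partial correlation of a and b given one variable."""
    r_ab = np.corrcoef(a, b)[0, 1]
    r_ag = np.corrcoef(a, given)[0, 1]
    r_bg = np.corrcoef(b, given)[0, 1]
    return (r_ab - r_ag * r_bg) / np.sqrt((1 - r_ag**2) * (1 - r_bg**2))


def summarize(c1, c2, n=100_000, seed=0):
    """Return sample and model-implied dependencies for one (c1, c2) pair."""
    x, y, z = simulate_three_chain(c1, c2, n, seed)
    return {
        "corr_xy": np.corrcoef(x, y)[0, 1],
        # Under the assumed model: Cov(x, y) = 1, Var(y) = 1 + c1.
        "corr_xy_theory": 1.0 / np.sqrt(1.0 + c1),
        "corr_xz": np.corrcoef(x, z)[0, 1],
        # Under the assumed model: Cov(x, z) = 1, Var(z) = 1 + c1 + c2.
        "corr_xz_theory": 1.0 / np.sqrt(1.0 + c1 + c2),
        "pcorr_xz_given_y": partial_corr(x, z, y),
    }


if __name__ == "__main__":
    # Vary the noise variance at y while holding the one at z fixed.
    for c1 in (0.1, 1.0, 10.0, 100.0):
        s = summarize(c1, c2=1.0)
        print(f"c1={c1:6.1f}  corr(x,y)={s['corr_xy']:.3f}  "
              f"theory={s['corr_xy_theory']:.3f}  "
              f"pcorr(x,z|y)={s['pcorr_xz_given_y']:.3f}")
```

In this sketch the partial correlation of x and z given y stays near zero for every c1, while the pairwise correlation of x and y shrinks as c1 grows, so with a limited sample size a true edge can become hard to distinguish from no edge.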


Summary

Introduction

Identifying associations and network structures from observational data sets obtained across a given set of entities is a challenging problem and of great interest across a spectrum of disciplines, including molecular biology [1,2,3,4,5,6,7,8]. Constraint-based structure-learning algorithms by their very nature do not accommodate cycles and feedback between the molecules of interest, which is an inherent limitation. They have nonetheless proven to be useful approximations of pathways and signalling mechanisms [6,7,13]. The present study investigates the impact of noise on pairwise correlation, partial correlation and constraint-based structure learning algorithms by considering static data sets generated from linear models of popular network motifs and publicly available molecular expression data [7]. Analytical expressions for the impact of large discrepancies in noise variance on pairwise dependencies (correlation coefficient) and conditional dependencies (partial correlation) are investigated. The impact of such discrepancies on constraint-based Bayesian network structure learning is then investigated. Finally, the presence of significant discrepancies in noise variance and its impact on network inference from experimental molecular expression profiles [7] is investigated.
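
As an illustration of the kind of analytical expression referred to above, consider a worked example for the common-effect motif. This is a sketch under assumptions that are not taken from the paper: a linear model x -> z <- y with unit edge weights, mutually independent zero-mean noise terms, noise variance 1 at x, c_1 at y and c_2 at z. The paper's own parameterization and expressions may differ.

```latex
% Assumed model (illustrative only):
%   x = \epsilon_x, \quad y = \epsilon_y, \quad z = x + y + \epsilon_z,
%   \mathrm{Var}(\epsilon_x) = 1, \; \mathrm{Var}(\epsilon_y) = c_1, \; \mathrm{Var}(\epsilon_z) = c_2 .
\begin{align*}
\mathrm{Var}(z) &= 1 + c_1 + c_2, \qquad
\mathrm{Cov}(x,z) = 1, \qquad
\mathrm{Cov}(y,z) = c_1, \qquad
\mathrm{Cov}(x,y) = 0, \\
\rho_{xy} &= 0, \qquad
\rho_{xz} = \frac{1}{\sqrt{1 + c_1 + c_2}}, \qquad
\rho_{yz} = \sqrt{\frac{c_1}{1 + c_1 + c_2}}, \\
\rho_{xy \cdot z}
  &= \frac{\rho_{xy} - \rho_{xz}\,\rho_{yz}}
          {\sqrt{(1 - \rho_{xz}^2)(1 - \rho_{yz}^2)}}
   = -\sqrt{\frac{c_1}{(c_1 + c_2)(1 + c_2)}} .
\end{align*}
```

Under these assumptions, x and y are marginally uncorrelated but conditionally dependent given z. The conditional dependence approaches -1 as c_2 tends to 0 and vanishes as c_2 grows large, so a large noise variance at the common effect can make the conditional dependence difficult to detect from finite samples.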

Methods and Results
Constraint-based Bayesian Network Structure
Simulation Results
Application to Molecular Expression Profiles
Discussion
