Abstract
Background: Reconstructing gene regulatory networks (GRNs) from expression data is a challenging task that has become essential to the understanding of complex regulatory mechanisms in cells. The major issues are the usually very high ratio of the number of genes to the sample size, and the noise in the available data. Integrating biological prior knowledge into the learning process is a natural and promising way to partially compensate for the lack of reliable expression data and to increase the accuracy of network reconstruction algorithms.
Results: In this manuscript, we present PriorPC, a new algorithm based on the PC algorithm. The PC algorithm is one of the most popular methods for Bayesian network reconstruction. The result of PC is known to depend on the order in which conditional independence tests are processed, especially for large networks. PriorPC uses prior knowledge to exclude unlikely edges from network estimation and introduces a particular ordering for the conditional independence tests. We show on synthetic data that the structural accuracy of networks obtained with PriorPC is greatly improved compared to PC.
Conclusion: PriorPC improves the structural accuracy of inferred gene networks by using soft priors, which assign to each edge a probability of existence. It is robust to false priors, which are unavoidable in the context of biological data. PriorPC is also fast and scales well to large networks, which is important for its applicability to real data.
Electronic supplementary material: The online version of this article (doi:10.1186/s12918-015-0233-4) contains supplementary material, which is available to authorized users.
Highlights
Reconstructing gene regulatory networks (GRNs) from expression data is a challenging task that has become essential to the understanding of complex regulatory mechanisms in cells
GRN reconstruction from expression data is a challenging problem in systems biology: it suffers from high dimensionality and low sample size, as the number of genes is generally much larger than the number of biological samples, and biological measurements are extremely noisy
Our results show that the precision of the networks obtained with PriorPC is greatly improved over that of the networks obtained with PC, for every dataset
Summary
Reconstructing gene regulatory networks (GRNs) from expression data is a challenging task that has become essential to the understanding of complex regulatory mechanisms in cells. For instance, ChIP-seq data can reveal potential target genes for transcription factors (TFs). Each of these sources is limited and noisy, and only gives a partial picture of gene regulation. Taken together, they can help build a more robust description of the regulatory mechanisms, and reduce the effects of noise and sparsity in expression data. These pieces of information can be included in the process of GRN reconstruction in the form of prior knowledge, i.e. a subjective (but non-arbitrary) belief about what the network should look like. The use of prior information in network inference is a growing trend in computational biology [8,9,10,11].
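To make the idea of a soft prior concrete, the following is a minimal Python sketch of a PC-style skeleton search guided by edge priors. It is not the authors' implementation: the function names, the `prior_cutoff` threshold, the rule of testing the least likely edges first, and the use of Fisher z tests on partial correlations are all assumptions made for illustration, since the text above only states that PriorPC excludes unlikely edges and imposes a particular ordering on the conditional independence tests.

```python
import itertools
import math

import numpy as np


def partial_corr(corr, i, j, cond):
    """Partial correlation of variables i and j given the variables in cond."""
    idx = [i, j] + list(cond)
    prec = np.linalg.pinv(corr[np.ix_(idx, idx)])
    return -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])


def fisher_z_pvalue(r, n, k):
    """Two-sided p-value of the Fisher z test with k conditioning variables."""
    r = min(max(r, -0.999999), 0.999999)  # keep the log finite
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - k - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def prior_guided_skeleton(data, prior, alpha=0.01, prior_cutoff=0.1, max_cond=1):
    """Hypothetical prior-guided skeleton search (illustrative only).

    data  : (n_samples, n_genes) expression matrix
    prior : symmetric (n_genes, n_genes) matrix of edge probabilities
    """
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)

    # Assumption: edges with a very low prior are excluded before any test.
    adj = {i: set() for i in range(p)}
    for i, j in itertools.combinations(range(p), 2):
        if prior[i, j] >= prior_cutoff:
            adj[i].add(j)
            adj[j].add(i)

    for k in range(max_cond + 1):
        # Assumption: the least likely edges are tested (and possibly removed) first.
        edges = sorted(
            ((i, j) for i in adj for j in adj[i] if i < j),
            key=lambda e: prior[e],
        )
        for i, j in edges:
            removed = False
            for a, b in ((i, j), (j, i)):
                for cond in itertools.combinations(sorted(adj[a] - {b}), k):
                    r = partial_corr(corr, i, j, cond)
                    if fisher_z_pvalue(r, n, k) > alpha:
                        # Independence not rejected: drop the edge.
                        adj[i].discard(j)
                        adj[j].discard(i)
                        removed = True
                        break
                if removed:
                    break

    return {(i, j) for i in adj for j in adj[i] if i < j}
```

The function takes an expression matrix and a symmetric prior matrix and returns the set of retained undirected edges as index pairs. Because the prior only filters and orders candidate edges, an edge with a high prior can still be removed when the data support conditional independence, which is one simple way to remain tolerant of incorrect priors.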