Abstract

Inferring gene regulatory networks (GRNs) from expression data can reveal direct regulations among genes and provide deep insights into biological processes at the network level. Over the past decades, numerous computational approaches have been introduced for inferring GRNs. However, many of them still suffer from various problems: Bayesian network (BN) methods cannot handle large-scale networks because of their high computational complexity, while information theory-based methods cannot identify the directions of regulatory interactions and also suffer from false positives and false negatives. To overcome these limitations, we present a novel algorithm, the local Bayesian network (LBN), which infers GRNs from gene expression data using a network decomposition strategy and a false-positive edge elimination scheme. Specifically, the LBN algorithm first uses conditional mutual information (CMI) to construct an initial network, which is decomposed into a number of local networks. The BN method is then employed to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space of all possible GRN structures. These local BNs are integrated into a tentative network, to which CMI is applied to remove redundant regulations and thus alleviate the false-positive problem. The final network is obtained by iteratively performing CMI and local BN on the tentative network; during this iterative process, false or redundant regulations are gradually removed. When tested on the benchmark GRN datasets from the DREAM challenge as well as the SOS DNA repair network in E. coli, our results suggest that LBN significantly outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI), with more accurate and robust performance. In particular, the decomposition strategy with local Bayesian networks not only effectively reduces the computational cost of BN, owing to the much smaller sizes of the local GRNs, but also identifies the directions of the regulations.
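
To make the CMI-based edge elimination concrete, the following is a minimal sketch of how a redundant (indirect) regulation can be detected. The abstract does not say how CMI is estimated, so the Gaussian estimator used here is an assumption, and the function names `pearson`, `gaussian_mi` and `gaussian_cmi` are hypothetical. For jointly Gaussian variables, mutual information is -0.5 ln(1 - r^2), and first-order CMI is the same expression with the partial correlation in place of r.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def gaussian_mi(x, y):
    """I(X;Y) under a joint Gaussian assumption (not valid if |r| == 1)."""
    r = pearson(x, y)
    return -0.5 * math.log(1.0 - r * r)


def gaussian_cmi(x, y, z):
    """I(X;Y|Z) for a single conditioning gene Z, via partial correlation."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    partial = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return -0.5 * math.log(1.0 - partial * partial)


# Synthetic chain X -> Z -> Y: X and Y are dependent only through Z.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(2000)]
z = [xi + 0.5 * rng.gauss(0, 1) for xi in x]
y = [zi + 0.5 * rng.gauss(0, 1) for zi in z]

print("MI(X;Y)    =", gaussian_mi(x, y))      # clearly above zero
print("CMI(X;Y|Z) =", gaussian_cmi(x, y, z))  # close to zero
```

In this toy chain, the X-Y edge looks strong under plain mutual information but nearly vanishes once Z is conditioned on, which is the kind of redundant regulation a CMI threshold would delete. The threshold value itself is not given in the abstract and would have to be chosen separately.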

Highlights

  • Gene regulatory networks (GRNs) that explicitly characterize regulatory processes in cells are typically modeled by graphs, in which the nodes represent the genes and the edges reflect the regulatory or interaction relationship between genes [1]

  • We propose a novel local Bayesian network (LBN) algorithm that improves the accuracy of GRN inference from gene expression data by combining the advantages of the Bayesian network (BN) and conditional mutual information (CMI) methods

  • The results indicate that decomposing the network can significantly reduce the high computational cost of the BN method for large-scale GRNs, while deleting false-positive edges with CMI and k-nearest neighbors (kNN) can markedly improve the accuracy of network inference
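
The decomposition idea in the highlights above can be sketched as follows. This is an illustration under a stated assumption: "k-nearest neighbors" is read here as the genes within k hops of a target gene in the initial undirected network, which the text on this page does not confirm. The helper names `k_hop_neighbors` and `decompose` are hypothetical.

```python
from collections import deque


def k_hop_neighbors(adjacency, gene, k):
    """Return the set of genes within k hops of `gene` (excluding itself).

    `adjacency` maps each gene to an iterable of its neighbors in an
    undirected initial network.
    """
    seen = {gene}
    frontier = deque([(gene, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand beyond k hops
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    seen.discard(gene)
    return seen


def decompose(adjacency, k=1):
    """Map each gene to its sorted candidate regulators (one local network per gene)."""
    return {g: sorted(k_hop_neighbors(adjacency, g, k)) for g in adjacency}


# Toy initial network: a simple path A - B - C - D.
initial = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
print(decompose(initial, k=1))
print(decompose(initial, k=2))
```

The payoff is combinatorial: with n genes, each gene has 2^(n-1) possible parent sets, whereas restricting the search to m candidate regulators leaves only 2^m, so a structure search over each local network stays tractable even when n is large.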

Summary

Introduction

Gene regulatory networks (GRNs) that explicitly characterize regulatory processes in cells are typically modeled by graphs, in which the nodes represent the genes and the edges reflect the regulatory or interaction relationships between genes [1]. Supervised learning-based methods, such as SEREND [8], GENIES [9] and SIRENE [11], use known regulations to infer GRNs from genome-wide data, but they require additional information about regulatory interactions to train a model. The scarcity of labeled or known gene datasets limits the application of this kind of approach [47, 48].

