Abstract
Bayesian network and linear regression methods have been widely applied to reconstruct cellular regulatory networks. In this work, we propose a Bayesian model averaging for linear regression (BMALR) method to infer molecular interactions in biological systems. The method uses a new closed-form solution to compute the posterior probabilities of the edges from regulators to the target gene within a hybrid framework of Bayesian model averaging and linear regression. We assessed the performance of BMALR by benchmarking on both in silico DREAM datasets and real experimental datasets. The results show that BMALR achieves both high prediction accuracy and high computational efficiency across different benchmarks. Pre-processing the datasets with a log transformation further improves the performance of BMALR, leading to a new top overall performance. In addition, BMALR achieves robust high performance in community predictions when it is combined with other competing methods. BMALR is competitive with existing network inference methods and should therefore be useful for inferring regulatory interactions in biological networks. A free open-source software tool for the BMALR algorithm is available at https://sites.google.com/site/bmalr4netinfer/.
Highlights
With advances in high-throughput experimental technologies, many network inference methods have been developed to identify regulatory interactions in cellular networks from quantitative experimental data
The results show that Bayesian model averaging for linear regression (BMALR) obtained the highest overall score among all the applied network inference methods
We propose a Bayesian Model Averaging for Linear Regression (BMALR) method to reconstruct cellular regulatory networks
Summary
With advances in high-throughput experimental technologies, many network inference methods have been developed to identify regulatory interactions in cellular networks from quantitative experimental data. To infer the interactions among network variables, one strategy is to find the directed acyclic graph that most likely generated the observed experimental data, which are assumed to be a steady-state data set for static Bayesian networks. This is done by evaluating each possible graph with a score-based approach in the Bayesian context and then searching for the graph that maximizes the score.[16] The score function is commonly defined with one of two probabilistic models: linear Gaussian models or multinomial models.[3] However, it is computationally laborious to evaluate all possible graphs, corresponding to all possible interactions, and choose the best-scoring graph.[17,18] To address this problem, heuristic search methods (e.g., the greedy hill-climbing approach) were proposed.[5] On the other hand, given limited amounts of data, a variety of graph structures may describe the data well. A network-averaging strategy was therefore proposed to find the consensus interactions present in most of the high-scoring graphs.[5,19]
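As a rough illustration of the score-and-average idea described above, the following Python sketch estimates edge probabilities for a single target gene by averaging over linear Gaussian models. It is not the closed-form BMALR solution from the paper: it enumerates regulator subsets up to a small size, scores each with a BIC approximation to the log marginal likelihood, and assumes a uniform prior over subsets. The function names `bic_score` and `edge_posteriors`, the `max_parents` limit, and the synthetic data are all assumptions made for this example.

```python
import itertools

import numpy as np


def bic_score(X, y, subset):
    """BIC approximation to the log marginal likelihood of a linear
    Gaussian model regressing y on the columns of X in `subset`
    (an intercept is always included)."""
    n = len(y)
    design = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    rss = max(rss, 1e-12)  # guard against log(0) on a perfect fit
    k = design.shape[1]
    return -0.5 * n * np.log(rss / n) - 0.5 * k * np.log(n)


def edge_posteriors(X, y, max_parents=2):
    """Approximate posterior probability of each edge regulator -> target
    by averaging over all regulator subsets of size <= max_parents,
    assuming a uniform prior over those subsets (hypothetical choice)."""
    p = X.shape[1]
    subsets = [s for r in range(max_parents + 1)
               for s in itertools.combinations(range(p), r)]
    scores = np.array([bic_score(X, y, s) for s in subsets])
    # Subtract the maximum before exponentiating for numerical stability.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    post = np.zeros(p)
    for w, s in zip(weights, subsets):
        for j in s:
            post[j] += w  # an edge's probability is the total weight of models containing it
    return post


# Synthetic example: only regulator 1 truly drives the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 1] + 0.1 * rng.normal(size=200)
print(edge_posteriors(X, y))
```

Exhaustive enumeration is only feasible for a small `max_parents`, which is the same computational obstacle that motivates heuristic search and closed-form averaging in the text.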