Abstract

Bayesian networks are a powerful method for identifying causal relationships among variables. However, as network size increases, the time complexity of searching for the optimal structure grows exponentially. We propose a novel search algorithm, Fast and Furious Bayesian Network (FFBN). Compared to the existing greedy search algorithm, FFBN uses significantly fewer model configuration rules to determine the causal direction of edges when constructing the Bayesian network, which greatly improves computational speed. We benchmarked FFBN by reconstructing gene regulatory networks (GRNs) from two DREAM5 challenge datasets: a synthetic dataset and a larger yeast transcriptome dataset. On both datasets, FFBN ran much faster than the existing greedy search algorithm while maintaining equal or better recall and precision. We then constructed three whole-transcriptome GRNs from primary liver cancer (PL), primary colon cancer (PC) and colon-to-liver metastasis (CLM) expression data, a task at which the existing greedy search algorithms failed. The three GRNs share 12,099 genes. FFBN is thus able to build GRNs at an unprecedented scale of more than 10,000 genes. Using FFBN, we discovered that CLM has its own unique cancer molecular mechanisms and shares some similarity with both PL and PC.
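To give a rough sense of why structure search becomes expensive as networks grow, the sketch below counts the labeled directed acyclic graphs (DAGs) on n nodes using Robinson's recurrence. It is only an illustration of the size of the search space, not part of FFBN or of the greedy search algorithm, and the function name `count_dags` is a hypothetical helper introduced here.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Number of labeled DAGs on n nodes, via Robinson's recurrence.

    a(0) = 1
    a(n) = sum_{k=1..n} (-1)^(k+1) * C(n, k) * 2^(k*(n-k)) * a(n-k)
    """
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


if __name__ == "__main__":
    # Print the number of candidate structures for small networks.
    for n in range(1, 11):
        print(f"{n} nodes: {count_dags(n)} possible DAGs")
```

Even at five nodes there are already 29,281 candidate structures, so exhaustive enumeration is out of reach long before a network approaches transcriptome scale, which is why heuristic searches are used in practice.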
