Abstract

Elucidating gene regulatory networks (GRNs) from large-scale experimental data remains a central challenge in systems biology. Recently, numerous techniques, particularly consensus-driven approaches that combine different algorithms, have emerged as a promising strategy for inferring accurate GRNs. Here, we develop a novel consensus inference algorithm, TopkNet, that can integrate multiple algorithms to infer GRNs. Comprehensive performance benchmarking on a cloud computing framework demonstrated that (i) a simple strategy of combining many algorithms does not always yield a performance improvement commensurate with the cost of consensus and (ii) TopkNet integrating only high-performance algorithms provides significant performance improvement over the best individual algorithms and community prediction. These results suggest that a priori determination of high-performance algorithms is key to reconstructing an unknown regulatory network. Similarity among gene-expression datasets can be useful for determining potentially optimal algorithms for reconstructing unknown regulatory networks: if the expression data associated with a known regulatory network are similar to those associated with an unknown regulatory network, the optimal algorithms determined for the known network can be repurposed to infer the unknown one. Based on this observation, we developed a quantitative measure of similarity among gene-expression datasets and demonstrated that, if the similarity between two expression datasets is high, TopkNet integrating the algorithms that are optimal for the known dataset performs well on the unknown dataset. The consensus framework, TopkNet, together with the similarity measure proposed in this study, provides a powerful strategy for harnessing the wisdom of the crowds in reconstructing unknown regulatory networks.

Highlights

  • Most genes do not exert their functions in isolation [1], but rather through regulatory interactions with one another

  • We took a similar approach, “TopkNet”, to see whether the “wisdom of crowds” approach can be applied to network reconstruction

  • Such an analysis raises the question “How should optimal algorithms be chosen for a given dataset?” We found that similarity among gene-expression datasets is key to selecting optimal algorithms, i.e., if dataset A, for which the optimal algorithms are known, is similar to dataset B, the optimal algorithms for dataset A may also be optimal for dataset B


Introduction

Most genes do not exert their functions in isolation [1], but rather through regulatory interactions with one another. Such regulatory interactions occur within the same cell, between different cells, and even between different organs, forming large-scale gene regulatory networks (GRNs). Reconstruction of accurate and comprehensive GRNs from genome-wide experimental data (e.g., gene expression data from DNA microarray experiments) is one of the fundamental challenges in systems biology [3,4]. Systematic and comparative assessment of the performance of GRN inference algorithms remains a major challenge in network reconstruction [33].

