Abstract

Inferring transcriptional gene regulatory networks from transcriptomic datasets is a key challenge of systems biology, with potential impacts ranging from medicine to agronomy. Several techniques are currently used to experimentally assay transcription factor-to-target relationships, providing important information about the connections of real gene regulatory networks. These techniques include classical ChIP-seq and yeast one-hybrid and, more recently, DAP-seq and target technologies. They are usually used to validate algorithm predictions. Here, we developed a reverse engineering approach based on mathematical and computer simulation to evaluate the impact that this prior knowledge on gene regulatory networks may have on training machine learning algorithms. First, we developed a gene regulatory network-simulating engine called FRANK (Fast Randomizing Algorithm for Network Knowledge) that is able to simulate large gene regulatory networks (containing 10^4 genes) with characteristics of gene regulatory networks observed in vivo. FRANK also generates stable or oscillatory gene expression directly produced by the simulated gene regulatory networks. The development of FRANK leads to important general conclusions concerning the design of large and stable gene regulatory networks harboring scale-free properties (built ex nihilo). In combination with a supervised (accepting prior knowledge) support vector machine algorithm, we (i) address biologically oriented questions concerning our capacity to accurately reconstruct gene regulatory networks, and in particular demonstrate that prior-knowledge structure is crucial for accurate learning, and (ii) draw conclusions to inform the experimental design needed to perform learning able to solve gene regulatory networks in the future.
By demonstrating that our predictions concerning the influence of the prior-knowledge structure on support vector machine learning capacity hold true on real data (Escherichia coli K14 network reconstruction using network and transcriptomic data), we show that the formalism used to build FRANK can to some extent be a reasonable model for gene regulatory networks in real cells.
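
To make the idea of simulating a network together with the expression it produces more concrete, here is a minimal toy sketch. It is not the FRANK algorithm: the function name `simulate_toy_grn`, the linear update x_{t+1} = A x_t, the random sparse wiring, and the rescaling of the spectral radius to 0.9 are all assumptions chosen for illustration only.

```python
import numpy as np

def simulate_toy_grn(n_genes=200, n_tfs=20, density=0.1, n_steps=50, seed=0):
    """Toy stand-in for a GRN simulator (hypothetical, NOT the FRANK algorithm)."""
    rng = np.random.default_rng(seed)
    # A[i, j] is the effect of gene j on gene i. Only the first n_tfs genes
    # act as regulators, so only the first n_tfs columns can be nonzero.
    A = np.zeros((n_genes, n_genes))
    mask = rng.random((n_genes, n_tfs)) < density
    A[:, :n_tfs] = rng.normal(size=(n_genes, n_tfs)) * mask
    # Assumption: rescale so the spectral radius is 0.9 (< 1), which makes
    # the linear dynamics x_{t+1} = A x_t converge instead of diverging.
    radius = np.max(np.abs(np.linalg.eigvals(A)))
    if radius > 1e-12:
        A *= 0.9 / radius
    # Simulate expression over time from a random initial state.
    x = rng.normal(size=n_genes)
    trajectory = [x]
    for _ in range(n_steps):
        x = A @ x
        trajectory.append(x)
    return A, np.array(trajectory)
```

Calling `simulate_toy_grn()` returns the regulatory matrix and an array with one row of expression values per time step. Placing eigenvalues of modulus 1 on or off the real axis would be one way to obtain steady or oscillatory regimes in such a linear model, but that is a property of this toy formulation and not a claim about how FRANK is implemented.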

Highlights

  • Gene regulation plays a key role in the control of fundamental processes in living organisms, ranging from development to nutrition and metabolic coordination

  • During the several DREAM challenges that focused on GRNs inference, the machine learning procedures are trained on simulated gene expression, on mutant versions of the networks, as well as on perturbed networks, where expression of several genes is modified to simulate external influencing factors

  • [...] inference of GRNs? What proportion of TF or target gene expression are needed to properly infer GRNs? Are machine learning procedures resilient to bad quality prior knowledge in inferring GRNs from it? We propose answers to these questions derived from our in silico simulations

  • We applied bootstrap techniques to evaluate whether the percentage of true positives is systematically higher than the percentage of false positives in the case of TA-oriented prior knowledge, as reported by the volume under the surface (VUS) (Fig. 5c)
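
The bootstrap comparison in the last highlight can be sketched as follows. This is a generic paired percentile bootstrap, not the authors' procedure: the function name `bootstrap_tp_minus_fp`, the pairing of true-positive and false-positive percentages per run, and the 95% interval are assumptions made for illustration.

```python
import numpy as np

def bootstrap_tp_minus_fp(tp_pct, fp_pct, n_boot=2000, seed=0):
    """Percentile bootstrap CI for the mean of (TP% - FP%) over paired runs.

    Hypothetical helper: inputs are per-run percentages of true and false
    positives, paired by position.
    """
    rng = np.random.default_rng(seed)
    diffs = np.asarray(tp_pct, dtype=float) - np.asarray(fp_pct, dtype=float)
    n = diffs.size
    # Resample runs with replacement and recompute the mean difference.
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_means = diffs[idx].mean(axis=1)
    lower, upper = np.percentile(boot_means, [2.5, 97.5])
    return lower, upper
```

If the lower bound of the interval stays above zero, the resampled data are consistent with true positives being systematically more frequent than false positives.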

Summary

INTRODUCTION

Gene regulation plays a key role in the control of fundamental processes in living organisms, ranging from development to nutrition and metabolic coordination.

[...] of the structure of the network, as well as the characteristics of the data needed to perform good reconstruction. In this sense, our work is very much biologically oriented and proposes math[...]

FRANK was designed to quickly generate several hundreds of different large networks having different tunable parameters and their corresponding simulated expression.

The constraints appearing on the right hand side of the equation above simply read “points such that y = +1 are on one side of the hyperplane and points such that y = −1 are on [...]

[...] linear, the LASSO estimate, providing nice interpretation properties, has poor prediction power and is suited for network inference, whereas, as explained earlier, we are fundamentally going through a learning approach.
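
The hyperplane constraint quoted above can be checked numerically. The sketch below assumes the standard hard-margin form y_i (w · x_i + b) ≥ 1; the equation the excerpt refers to is not reproduced here, so the exact formulation used by the authors (for example, one with slack variables) may differ. The function name `satisfies_margin` is hypothetical.

```python
import numpy as np

def satisfies_margin(w, b, X, y):
    """Return a boolean array: True where y_i * (w . x_i + b) >= 1.

    Assumed hard-margin SVM constraint; w is the hyperplane normal,
    b the offset, X the points (one per row), y the labels in {+1, -1}.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    return y * (X @ np.asarray(w, dtype=float) + b) >= 1

# Hyperplane x1 = 0: the first two points lie on their correct side and
# outside the margin, the third lies on the correct side but inside it.
w, b = [1.0, 0.0], 0.0
X = [[2.0, 0.0], [-3.0, 1.0], [0.5, 0.0]]
y = [1, -1, 1]
result = satisfies_margin(w, b, X, y)
```

Points labeled +1 must fall on one side of the hyperplane and points labeled −1 on the other, each at a functional distance of at least 1 from it.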

Part II: Biological insights using FRANK and SVM
DISCUSSION
Findings
METHODS