Abstract

Background: To systematically understand the interactions between numerous biological components, a variety of biological networks at different levels and scales have been constructed and made available in public databases or knowledge repositories. Graphical models such as structural equation models (SEMs) have long been used to describe biological networks for various quantitative analysis tasks, especially the estimation of key biological parameters. However, because of limited resources or technical capacity, experimental observations of biological networks are often partial. An important problem is therefore how to select unobserved nodes for additional measurement such that all unknown model parameters become identifiable. To the best of our knowledge, no solution to this problem existed before this study.

Results: The identifiability-based observation problem for biological networks is mathematically formulated for the first time based on linear recursive structural equation models, and a dynamic programming strategy is then developed to obtain the optimal observation strategies. The dynamic programming algorithm achieves its efficiency by avoiding both the symbolic computation and the matrix operations used in other studies. We also provide the necessary theoretical justification for the proposed method. Finally, we verify the algorithm using synthetic network structures and illustrate its practical application using a real biological network related to influenza A virus infection.

Conclusions: The proposed approach is the first solution to the structural identifiability-based optimal observation remedy problem. It applies to any directed acyclic biological network (recursive SEM) without bidirectional edges, and it can be implemented computationally. Observation remedy is an important issue in experiment design for biological networks, and we believe that this study provides a solid basis for dealing with more challenging design issues (e.g., feedback loops, dynamic or nonlinear networks) in the future. We implemented our method in R, which is freely accessible at https://github.com/Hongyu-Miao/SIOOR.

Highlights

  • To systematically understand the interactions between numerous biological components, a variety of biological networks on different levels and scales have been constructed and made available in public databases or knowledge repositories

  • Identifiability analysis has long been recognized as a powerful tool for assuring the accuracy and reliability of parameter estimation techniques; identifiability-based observation strategy design for biological networks nevertheless remains an unexplored field, despite its substantial importance to biological network studies such as structure identification

  • For a given network structure, the key idea is to turn a minimum number of unobserved nodes in the original observation strategy into observed nodes such that the number of non-redundant identifiability equations becomes greater than or equal to the number of unknown model parameters
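
To make the counting idea in the last highlight concrete, here is a minimal sketch and not the authors' algorithm (their R implementation is in the repository linked in the abstract). It assumes a three-node chain x1 → x2 → x3 written as the linear recursive SEM y = By + e with independent errors, and it treats the error variances as unknown alongside the edge coefficients. The function names, node ordering, and numeric values are hypothetical. With x2 unobserved, two different parameter sets give the same covariance for the observed pair (x1, x3), so the parameters cannot be identified; observing x2 separates them.

```python
def implied_covariance(B, omega):
    """Covariance of y = B y + e for a strictly lower-triangular B
    (nodes in topological order) and independent errors with
    variances omega, i.e. a recursive SEM without bidirected edges."""
    n = len(B)
    # T = (I - B)^{-1}, built row by row by forward substitution.
    T = [[0.0] * n for _ in range(n)]
    for i in range(n):
        T[i][i] = 1.0
        for j in range(i):
            T[i][j] = sum(B[i][k] * T[k][j] for k in range(j, i))
    # Sigma = T * diag(omega) * T^T
    return [[sum(T[i][k] * omega[k] * T[j][k] for k in range(n))
             for j in range(n)] for i in range(n)]


def observed_block(sigma, observed):
    """Rows and columns of sigma for the observed node indices."""
    return [[sigma[i][j] for j in observed] for i in observed]


def chain(b21, b32):
    """Coefficient matrix for the chain x1 -> x2 -> x3."""
    return [[0.0, 0.0, 0.0],
            [b21, 0.0, 0.0],
            [0.0, b32, 0.0]]


# Two hypothetical parameter sets (illustrative values only).
cov_a = implied_covariance(chain(2.0, 3.0), [1.0, 1.0, 1.0])
cov_b = implied_covariance(chain(3.0, 2.0), [1.0, 1.0, 6.0])

# With x2 hidden (observe indices 0 and 2) the two models coincide.
same_when_hidden = observed_block(cov_a, [0, 2]) == observed_block(cov_b, [0, 2])
# With x2 observed as well, the full covariance matrices differ.
differ_when_full = cov_a != cov_b
print(same_when_hidden, differ_when_full)
```

Under these assumptions the chain has five unknowns (two edge coefficients and three error variances). Observing two nodes yields at most three distinct covariance entries, while observing all three yields six, which is consistent with requiring at least as many non-redundant equations as unknowns. This raw entry count is only an upper bound on the number of non-redundant identifiability equations, not the count used by the proposed method.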

Summary

Introduction

To systematically understand the interactions between numerous biological components, a variety of biological networks at different levels and scales have been constructed and made available in public databases or knowledge repositories. Graphical models such as structural equation models have long been used to describe biological networks for various quantitative analysis tasks, especially the estimation of key biological parameters. When studying the responses of a biological network (e.g., activation or inhibition) to different environmental signals (e.g., different signaling molecules or different doses of the same signaling molecule), edge coefficients are likely to vary across conditions and need to be estimated under each condition for the same given network structure [18]. In such a scenario, the structure of the corresponding graphical model is known and fixed, but concerns about the accuracy and reliability of parameter estimates often arise due to, e.g., the existence of unobserved node variables (i.e., latent variables). A natural question to ask is: what is the remedy that enables us to obtain reliable parameter estimates for a given graphical model structure with partially observed variables?
