Abstract

Background: Model rejections lie at the heart of systems biology, since they provide conclusive statements: that the corresponding mechanistic assumptions do not serve as valid explanations for the experimental data. Rejections are usually made using, e.g., the chi-square test (χ²) or the Durbin-Watson test (DW). Analytical formulas for the corresponding distributions rely on assumptions that typically are not fulfilled. This problem is partly alleviated by bootstrapping, a computationally heavy approach for calculating an empirical distribution. Bootstrapping also extends naturally to the estimation of joint distributions, but this feature has so far been little exploited.

Results: We herein show that simplistic combinations of bootstrapped tests, like the max or min of the individual p-values, give inconsistent, i.e. overly conservative or liberal, results. A new two-dimensional (2D) approach based on parametric bootstrapping, on the other hand, is found to be both consistent and more powerful than the individual tests when tested on static and dynamic examples where the truth is known. In the same examples, the best-performing test is a 2D χ² vs χ², where the second χ²-value comes from an additional help model and its ability to describe bootstraps from the tested model. This superiority is lost if the help model is too simple or too flexible. If a useful help model is found, the most powerful approach is the bootstrapped log-likelihood ratio (LHR). We show that this is because the LHR is one-dimensional, because the second dimension comes at a cost, and because the LHR retains most of the crucial information in the 2D distribution. These approaches statistically resolve a previously published rejection example for the first time.

Conclusions: We have shown how to, and how not to, combine tests in a bootstrap setting, when the combination is advantageous, and when it is advantageous to include a second model. These results also provide a deeper insight into the original motivation for formulating the LHR, for the more general setting of nonlinear and non-nested models. These insights are valuable in cases when accuracy and power, rather than computational speed, are prioritized.
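
A short worked calculation may help show why the max or min of two p-values can be conservative or liberal. It assumes an idealized case that is not taken from the paper: two independent tests whose p-values p_1 and p_2 are each uniform on [0, 1] under H0, with rejection when the combined value is at most α.

```latex
\begin{align}
P\bigl(\min(p_1, p_2) \le \alpha\bigr) &= 1 - (1 - \alpha)^2 = 2\alpha - \alpha^2, \\
P\bigl(\max(p_1, p_2) \le \alpha\bigr) &= \alpha^2 .
\end{align}
% For \alpha = 0.05: the min rule rejects a true H0 with probability 0.0975
% (liberal), the max rule with probability 0.0025 (conservative).
```

Tests computed from the same residuals, such as χ² and DW, will generally not be independent, so these numbers only indicate the direction of the bias, not its size.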

Highlights

  • Model rejections lie at the heart of systems biology, since they provide conclusive statements: that the corresponding mechanistic assumptions do not serve as valid explanations for the experimental data

  • We tested these approaches on static linear examples mainly for two reasons: firstly, static models are common in science, and our methods should aim to be applicable to these kinds of problems; secondly, for these static linear examples the solutions to the corresponding optimization problems are unique and analytically attainable

  • For each such data set both models served as H0, and were consecutively fitted to the data, and the Goodness of Fit (GOF) was evaluated using various bootstrap approaches, starting with the simplistic combinations described earlier
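
The fit-then-bootstrap procedure described in these highlights can be sketched for a single test statistic as follows. This is a minimal illustration, not the authors' implementation. It assumes a straight-line model as H0, Gaussian noise with a known standard deviation `sigma`, and a χ² statistic. All function names (`fit_linear`, `chi2_stat`, `bootstrap_chi2_pvalue`) are hypothetical.

```python
import numpy as np


def fit_linear(x, y):
    """Least-squares fit of the assumed H0 model y = a + b*x; returns (a, b)."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def chi2_stat(x, y, coef, sigma):
    """Sum of squared residuals, each scaled by the (assumed known) noise sd."""
    resid = y - (coef[0] + coef[1] * x)
    return float(np.sum((resid / sigma) ** 2))


def bootstrap_chi2_pvalue(x, y, sigma, n_boot=1000, seed=0):
    """Parametric bootstrap p-value for the chi-square goodness of fit.

    Data sets are simulated from the fitted H0 model, the model is refitted
    to each one, and the observed statistic is compared with the resulting
    empirical distribution.
    """
    rng = np.random.default_rng(seed)
    coef = fit_linear(x, y)
    observed = chi2_stat(x, y, coef, sigma)
    y_hat = coef[0] + coef[1] * x

    boot = np.empty(n_boot)
    for i in range(n_boot):
        # Simulate a data set under H0, then refit before scoring it.
        y_sim = y_hat + rng.normal(0.0, sigma, size=x.shape)
        coef_sim = fit_linear(x, y_sim)
        boot[i] = chi2_stat(x, y_sim, coef_sim, sigma)

    # Add-one correction keeps the empirical p-value strictly positive.
    p_value = (1 + np.sum(boot >= observed)) / (n_boot + 1)
    return observed, p_value
```

Calling `bootstrap_chi2_pvalue(x, y, sigma)` on data generated with clear curvature should give a small p-value, i.e. a rejection of the straight-line model. Recording a second statistic for each bootstrap sample, for example DW, would give the joint empirical distribution on which a 2D test could be built.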


Summary

Introduction

Model rejections lie at the heart of systems biology, since they provide conclusive statements: that the corresponding mechanistic assumptions do not serve as valid explanations for the experimental data. Analytical expressions for the distributions of the test statistics used in such rejections rest on several assumptions. Some commonly used ones are that the experimental noise is normally or log-normally distributed, that the parameter estimates have converged, and that the parameters appear linearly in the model [15,16,17,18]. Because many of these assumptions are unfulfilled in systems biology problems, it is problematic to use these analytical expressions. The assumptions often fail because the available data in systems biology examples are often severely limited, the signal-to-noise ratio is poor, the number of parameters that appear non-linearly and/or are unidentifiable is often high, and, for model comparison approaches such as the likelihood ratio test, the tested models are not nested [18,19,20,21,22,23,24]. For more information on these assumptions and limitations, we refer the reader to our previous paper [2].

