Abstract

Background: Data-driven methods that automatically learn relations between attributes from given data are a popular tool for building mathematical models in computational biology. Since measurements are prone to errors, approaches dealing with uncertain data are especially suitable for this task. Fuzzy models are one such approach, but they contain a large number of parameters and are thus susceptible to over-fitting. Validation methods that help detect over-fitting are therefore needed to eliminate inaccurate models.

Results: We propose a method to enlarge the validation datasets on which a fuzzy dynamic model of a cellular network can be tested. We apply our method to two data-driven dynamic models of the MAPK signalling pathway and two models of the mammalian circadian clock. We show that random initial state perturbations can drastically increase the mean prediction error of an inaccurate computational model, while keeping the prediction errors of accurate models small.

Conclusions: With the improvement of validation methods, fuzzy models are becoming more accurate and are thus likely to gain new applications. This field of research is promising not only because fuzzy models can cope with uncertainty, but also because their run time is short compared to the conventional modelling methods currently used in systems biology.

Highlights

  • Data-driven methods that automatically learn relations between attributes from given data are a popular tool for building mathematical models in computational biology

  • Our results show that the model generated with the multi-attribute fuzzy time series (MAFTS) method is much more accurate than the model generated with the fuzzy c-means (FCM) clustering algorithm; we could not draw this conclusion from testing datasets generated exclusively by epidermal growth factor (EGF) concentration perturbations

  • This paper describes an approach that helps eliminate inaccurate fuzzy data-driven models through initial state perturbations of a dynamic system
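
The idea of validating through initial state perturbations can be illustrated with a minimal sketch. It does not use the paper's MAPK or circadian clock models, nor any fuzzy modelling method: the two-variable toy system, its rate constants, the step size and the two candidate models below are all hypothetical, chosen only to show how a model that looks perfect on a single training trajectory can be exposed once the initial state is randomised.

```python
import random

DT = 0.1  # step size of the toy discrete-time system (assumed value)


def make_step(k1, k2, coupled=True):
    """Return a one-step update function for the state (x, y)."""
    def step(state):
        x, y = state
        drive = k2 * y if coupled else 0.0  # influence of y on x
        return (x + DT * (-k1 * x + drive), y + DT * (-k2 * y))
    return step


def simulate(step, state, n_steps=50):
    """Iterate a step function from an initial state and return the trajectory."""
    traj = [state]
    for _ in range(n_steps):
        state = step(state)
        traj.append(state)
    return traj


def mean_error(model_step, initial_states, n_steps=50):
    """Mean absolute error in x against the reference, over all runs and time points."""
    total, count = 0.0, 0
    for s0 in initial_states:
        ref = simulate(true_step, s0, n_steps)
        pred = simulate(model_step, s0, n_steps)
        for (x_ref, _), (x_pred, _) in zip(ref, pred):
            total += abs(x_ref - x_pred)
            count += 1
    return total / count


# All three systems are hypothetical stand-ins, not models from the paper.
true_step = make_step(1.0, 0.5)                 # reference dynamics
good_step = make_step(1.02, 0.5)                # slightly mis-estimated rate
bad_step = make_step(1.0, 0.5, coupled=False)   # ignores the y -> x coupling

# Single initial condition available during training: y starts at 0 and stays there,
# so the missing coupling in bad_step is invisible on this trajectory.
train_states = [(1.0, 0.0)]

# Enlarged validation set: random perturbations of the initial state.
rng = random.Random(0)
perturbed_states = [(rng.uniform(0.5, 1.5), rng.uniform(0.0, 2.0)) for _ in range(50)]

good_train = mean_error(good_step, train_states)
bad_train = mean_error(bad_step, train_states)
good_pert = mean_error(good_step, perturbed_states)
bad_pert = mean_error(bad_step, perturbed_states)

print(f"training state:   good={good_train:.4f}  bad={bad_train:.4f}")
print(f"perturbed states: good={good_pert:.4f}  bad={bad_pert:.4f}")
```

In this toy setting the structurally wrong model is indistinguishable from the reference on the training trajectory, and only the perturbed initial states separate it from the roughly correct one. Whether the same holds for a given real model depends on which parts of the state space the perturbations reach.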


Summary

Introduction

Data-driven methods that automatically learn relations between attributes from given data are a popular tool for building mathematical models in computational biology. Validation methods that help detect over-fitting are needed to eliminate inaccurate models. A diverse range of model-building methods is now available, with data-driven approaches playing an important role where a large amount of experimental data exists and prior knowledge of the system’s structure is limited. Bayesian networks are a promising approach to this problem: they allow qualitative data to be incorporated into the structure of the network, the likelihood function and the prior probability distribution of Bayes’ rule [5], with the drawback that the prior probability distribution may sometimes not be available [6].

