Abstract

*Motivation:* The creation of accurate quantitative Systems Biology Markup Language (SBML) models is a time-intensive, manual process often complicated by the many data sources and formats required to annotate even a small and well-scoped model. Ideally, the retrieval and integration of biological knowledge for model annotation should be performed quickly, precisely, and with a minimum of manual effort. Here, we present a method using off-the-shelf semantic web technology which enables this process: the heterogeneous data sources are first syntactically converted into ontologies; these are then aligned to a small domain ontology by applying a rule base. Integrating resources in this way can accommodate multiple formats with different semantics; it provides richly modelled biological knowledge suitable for annotation of SBML models. *Results:* We demonstrate proof-of-principle for this rule-based mediation with two use cases for SBML model annotation. This was implemented with existing tools, decreasing development time and increasing reusability. This initial work establishes the feasibility of this approach as part of an automated SBML model annotation system. *Availability:* Detailed information including download and mapping of the ontologies as well as integration results is available from http://www.cisban.ac.uk/RBM

Highlights

  • The integration of life sciences data remains an ongoing challenge

  • Rule-based mediation was performed for two use cases to show proof-of-principle in the context of Systems Biology Markup Language (SBML) model annotation

  • We have demonstrated that rule-based mediation, as implemented for our use cases, is a suitable method for semantic data integration in the context of model annotation

Summary

Introduction

The integration of life sciences data remains an ongoing challenge. The multitude of data sources and formats, which differ in both syntax and semantics, makes this task difficult. Errors in data integration are possible when data sources do not describe their information with shared semantics (Philippi and Kohler, 2006). The problems of syntactic and semantic data integration, and the historical approaches to them, have been well described (Sujansky, 2001; Alonso-Calvo et al., 2007). Though semantic data integration allows for a richer description of the biology than is possible with syntactic methods, it remains difficult in bioinformatics, partly due to the bespoke nature of the tooling. Mediator-based approaches extend ontology mapping such that a core ontology is mapped to a large number of satellite source ontologies. Such approaches have typically viewed the core ontology as a union of the source ontologies rather than as a semantically rich description of the research domain.
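
The two-step idea described here (syntactic conversion of each source, followed by rule-based alignment to a small core ontology) can be illustrated with a minimal sketch in plain Python. This is not the implementation or tooling used in the work itself: the triple representation, the `lift_record`, `rename_rule` and `mediate` helpers, the `dbA`/`dbB` sources and the `core:hasName` property are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, Iterable, List, NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object statement."""
    subject: str
    predicate: str
    obj: str


# A rule inspects one source triple and returns zero or more core triples.
Rule = Callable[[Triple], List[Triple]]


def lift_record(source: str, record: Dict[str, str], id_field: str) -> List[Triple]:
    """Syntactic conversion: turn one flat record into source-level triples.

    Field names are kept as-is, only prefixed with the source name, so no
    meaning is assigned at this stage.
    """
    subject = f"{source}:{record[id_field]}"
    return [
        Triple(subject, f"{source}:{field}", value)
        for field, value in record.items()
        if field != id_field
    ]


def rename_rule(source_predicate: str, core_predicate: str) -> Rule:
    """Build a rule mapping one source predicate onto one core predicate."""
    def rule(triple: Triple) -> List[Triple]:
        if triple.predicate == source_predicate:
            return [Triple(triple.subject, core_predicate, triple.obj)]
        return []
    return rule


def mediate(triples: Iterable[Triple], rules: List[Rule]) -> List[Triple]:
    """Semantic alignment: apply every rule to every source triple."""
    core: List[Triple] = []
    for triple in triples:
        for rule in rules:
            core.extend(rule(triple))
    return core


# Two hypothetical sources that name the same concept differently.
db_a = lift_record("dbA", {"acc": "P1", "protein_name": "Example protein"}, "acc")
db_b = lift_record("dbB", {"id": "X9", "label": "Another protein"}, "id")

# The rule base: both source fields align to a single core property.
rules = [
    rename_rule("dbA:protein_name", "core:hasName"),
    rename_rule("dbB:label", "core:hasName"),
]

core_triples = mediate(db_a + db_b, rules)
for triple in core_triples:
    print(triple)
```

In this sketch the core vocabulary is defined independently of the sources, and each source is connected to it only through its own rules, so adding a further source would mean adding rules rather than changing the core.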
