Structuring and extracting knowledge for the support of hypothesis generation in molecular biology.

Marco Roos, Sophia Katrenko, Willem Robert Van Hage, Konstantinos Krommydas, Pieter W Adriaans, M Scott Marshall, Andrew P Gibson, Martijn Schuemie, Edgar Meij

doi:10.1186/1471-2105-10-s10-s9

Abstract

Background: Hypothesis generation in molecular and cellular biology is an empirical process in which knowledge derived from prior experiments is distilled into a comprehensible model. The requirement of automated support is exemplified by the difficulty of considering all relevant facts that are contained in the millions of documents available from PubMed. Semantic Web provides tools for sharing prior knowledge, while information retrieval and information extraction techniques enable its extraction from literature. Their combination makes prior knowledge available for computational analysis and inference. While some tools provide complete solutions that limit the control over the modeling and extraction processes, we seek a methodology that supports control by the experimenter over these critical processes.

Results: We describe progress towards automated support for the generation of biomolecular hypotheses. Semantic Web technologies are used to structure and store knowledge, while a workflow extracts knowledge from text. We designed minimal proto-ontologies in OWL for capturing different aspects of a text mining experiment: the biological hypothesis, text and documents, text mining, and workflow provenance. The models fit a methodology that allows focus on the requirements of a single experiment while supporting reuse and posterior analysis of extracted knowledge from multiple experiments. Our workflow is composed of services from the 'Adaptive Information Disclosure Application' (AIDA) toolkit as well as a few others. The output is a semantic model with putative biological relations, with each relation linked to the corresponding evidence.

Conclusion: We demonstrated a 'do-it-yourself' approach for structuring and extracting knowledge in the context of experimental research on biomolecular mechanisms. The methodology can be used to bootstrap the construction of semantically rich biological models using the results of knowledge extraction processes. Models specific to particular experiments can be constructed that, in turn, link with other semantic models, creating a web of knowledge that spans experiments. Mapping mechanisms can link to other knowledge resources such as OBO ontologies or SKOS vocabularies. AIDA Web Services can be used to design personalized knowledge extraction procedures. In our example experiment, we found three proteins (NF-Kappa B, p21, and Bax) potentially playing a role in the interplay between nutrients and epigenetic gene regulation.

Highlights

Hypothesis generation in molecular and cellular biology is an empirical process in which knowledge derived from prior experiments is distilled into a comprehensible model
Mapping mechanisms can link to other knowledge resources such as Open Biomedical Ontology (OBO) ontologies or SKOS vocabularies
Many Web resources give molecular biologists access to existing knowledge, of which Entrez PubMed, hosted by the US National Center for Biotechnology Information (NCBI), is probably the most widely used

Summary

Introduction

Hypothesis generation in molecular and cellular biology is an empirical process in which knowledge derived from prior experiments is distilled into a comprehensible model. The requirement of automated support is exemplified by the difficulty of considering all relevant facts that are contained in the millions of documents available from PubMed. Semantic Web provides tools for sharing prior knowledge, while information retrieval and information extraction techniques enable its extraction from literature. In order to study a biomolecular mechanism such as epigenetic gene control (Figure 1) and formulate a new hypothesis, we usually integrate various types of information to distil a comprehensible model. We can use this model to discuss with our peers before we test the model in the laboratory or by comparison to available data. Tools and algorithms have been developed that match predefined sets of biological terms [7,8], or that use machine learning algorithms to recognize entities and extract relations based on their context in a document.

http://www.biomedcentral.com/1471-2105/10/S10/S9
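The simpler of the two approaches mentioned above, matching a predefined set of biological terms, can be illustrated with a minimal sketch. This is not the AIDA toolkit or the authors' workflow; the function names, the dictionary layout, and the co-occurrence heuristic are assumptions made for illustration. Only the three protein names are taken from the abstract. Each putative relation keeps a pointer to the sentence it came from, mirroring the idea that every extracted relation stays linked to its evidence.

```python
import re
from itertools import combinations

# Hypothetical vocabulary: the protein names appear in the abstract,
# but treating them as one flat term list is an assumption.
TERMS = ["NF-Kappa B", "p21", "Bax"]


def find_terms(sentence, terms=TERMS):
    """Return the vocabulary terms that occur in the sentence (case-insensitive)."""
    found = []
    for term in terms:
        # Lookarounds keep a term from matching inside a longer token.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            found.append(term)
    return found


def putative_relations(doc_id, text):
    """Pair terms that co-occur in a sentence and keep the sentence as evidence."""
    relations = []
    # Naive sentence split: ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        for a, b in combinations(find_terms(sentence), 2):
            relations.append(
                {"subject": a, "object": b, "document": doc_id, "evidence": sentence}
            )
    return relations
```

Calling `putative_relations("doc-1", some_abstract_text)` returns a list of dictionaries, one per co-occurring term pair. Co-occurrence is only a crude proxy for a biological relation; the machine learning approaches mentioned in the text use the surrounding context rather than mere presence in the same sentence.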



Journal: BMC Bioinformatics	Publication Date: Oct 1, 2009
Citations: 48	License type: CC BY 2.0


