Ontological interpretation of biomedical database content

Filipe Santana Da Silva, Stefan Schulz, Ludger Jansen, Fred Freitas

doi:10.1186/s13326-017-0127-z

Open Access

https://doi.org/10.1186/s13326-017-0127-z

Abstract

Background: Biological databases store data about laboratory experiments, together with semantic annotations, in order to support data aggregation and retrieval. The exact meaning of such annotations in the context of a database record is often ambiguous. We address this problem by grounding implicit and explicit database content in a formal-ontological framework.

Methods: By using a typical extract from the databases UniProt and Ensembl, annotated with content from GO, PR, ChEBI and NCBI Taxonomy, we created four ontological models (in OWL), which generate explicit, distinct interpretations under the BioTopLite2 (BTL2) upper-level ontology. The first three models interpret database entries as individuals (IND), defined classes (SUBC), and classes with dispositions (DISP), respectively; the fourth model (HYBR) is a combination of SUBC and DISP. For the evaluation of these four models, we consider (i) database content retrieval, using ontologies as query vocabulary; (ii) information completeness; and (iii) DL complexity and decidability. The models were tested under these criteria against four competency questions (CQs).

Results: IND does not raise any ontological claim, besides asserting the existence of sample individuals and relations among them. Modelling patterns have to be created for each type of annotation referent. SUBC interprets database entries as maximally fine-grained defined subclasses of the classes referred to by the data. DISP attempts to extract truly ontological statements from the database records, claiming the existence of dispositions. HYBR is a hybrid of SUBC and DISP and is more parsimonious regarding expressiveness and query answering complexity. For each of the four models, the four CQs were submitted as DL queries. This shows the ability to retrieve individuals with IND, and classes in SUBC and HYBR. DISP does not retrieve anything because the axioms with disposition are embedded in General Class Inclusion (GCI) statements.

Conclusion: Ambiguity of biological database content is addressed by a method that identifies implicit knowledge behind semantic annotations in biological databases and grounds it in an expressive upper-level ontology. The result is a seamless representation of database structure, content and annotations as OWL models.
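The difference in what each reading can return to a query may be easier to see in a toy sketch. The Python below is not the authors' OWL modelling and does not use a DL reasoner. It is a minimal illustration under stated assumptions, and every identifier in it (`protein_1`, `ProteinOfRecord1`, `included_in`, `bearer_of` and the helper functions) is hypothetical. It only mimics the pattern described above: an IND-style model answers with individuals, a SUBC-style model answers with named classes, and a DISP-style model, whose content sits in GCI-like axioms with anonymous left-hand sides, offers no named entry to return.

```python
# Toy sketch only: every identifier is a hypothetical stand-in, not an IRI
# or axiom taken from the published OWL models.

# IND-style reading: a record asserts sample individuals and relations among them.
IND_FACTS = {
    ("protein_1", "instance_of", "Protein"),
    ("organism_1", "instance_of", "Organism"),
    ("protein_1", "included_in", "organism_1"),
}

# SUBC-style reading: the record becomes a named, fine-grained subclass.
SUBC_PARENTS = {
    "ProteinOfRecord1": {"Protein"},
    "Protein": set(),
}

# DISP-style reading: the knowledge sits in GCI-like axioms whose left-hand
# side is an anonymous class expression (modelled here as a tuple).
DISP_GCIS = [
    (("Protein", "included_in", "Organism"), ("bearer_of", "Disposition")),
]


def individuals_of(facts, cls):
    """Return the individuals asserted to be instances of cls."""
    return sorted(s for s, p, o in facts if p == "instance_of" and o == cls)


def named_subclasses_of(parents, cls):
    """Return the named classes that are (transitively) subclasses of cls."""
    def is_below(c, seen=()):
        return any(
            p == cls or (p not in seen and is_below(p, seen + (c,)))
            for p in parents.get(c, ())
        )
    return sorted(c for c in parents if is_below(c))


def named_left_hand_sides(gcis):
    """Return only the named (string) left-hand sides of GCI-like axioms."""
    return sorted(lhs for lhs, _rhs in gcis if isinstance(lhs, str))


ind_answer = individuals_of(IND_FACTS, "Protein")
subc_answer = named_subclasses_of(SUBC_PARENTS, "Protein")
disp_answer = named_left_hand_sides(DISP_GCIS)
print(ind_answer, subc_answer, disp_answer)
```

In this simplification the same question about proteins yields an individual under IND, a class under SUBC, and an empty answer under DISP. A real evaluation would load the OWL models into a reasoner and submit the competency questions as DL queries.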

Highlights

Biological databases store data about laboratory experiments, together with semantic annotations, in order to support data aggregation and retrieval
Apart from the OWL profiles required, the results show how individuals can be retrieved with IND, and how classes can be retrieved in two-step queries with SUBC and with the hybrid representation combining subclasses and dispositions (HYBR)
IND contains more axioms than SUBC, DISP and HYBR because of the large number of relationships created among the individuals, even though an OWL model following the IND strategy may not include any class definitions

Summary

Introduction

Biological databases store data about laboratory experiments, together with semantic annotations, in order to support data aggregation and retrieval. Database records from the Unified Protein Resource (UniProt) [1] are annotated with. Although these domain ontologies, taken in isolation, obey formal principles and good practice guidelines [4, 5], the meaning of the annotations themselves has hardly been formalized so far. UniProt Core describes how database fields relate to each other, but without formalization and without links to GO, for example. This can be a source of misunderstanding and can hamper correct data interpretation, leading to doubtful or wrong conclusions.

Journal: Journal of Biomedical Semantics	Publication Date: Jun 26, 2017
Citations: 7	License type: open-access
