Abstract

Ontologies provide a structured representation of the concepts of a domain of knowledge as well as the relations between them. Attribute ontologies are used to describe the characteristics of the items of a domain, such as the functions of proteins or the signs and symptoms of disease, which opens the possibility of searching a database of items for the best match to a list of observed or desired attributes. However, naive search methods do not perform well on realistic data because the data are noisy, typical queries are imprecise and individual items may not display all attributes of the category to which they belong. We present a method for combining ontological analysis with Bayesian networks to deal with noise, imprecision and attribute frequencies, and demonstrate an application of our method as a differential diagnostic support system for human genetics. An implementation of the algorithm and the benchmark is available at http://compbio.charite.de/boqa/. Contact: Sebastian.Bauer@charite.de or Peter.Robinson@charite.de. Supplementary material for this article is available at Bioinformatics online.

Highlights

  • Ontologies are knowledge representations using controlled vocabularies that are designed to help knowledge sharing and computer reasoning (Robinson and Bauer, 2011)

  • We develop the Bayesian Ontology Query Algorithm (BOQA), which, in contrast to previous approaches, integrates the knowledge stored in an ontology and the accompanying annotations into a Bayesian network (Neapolitan, 2003) in order to implement a search system in which users enter one or more terms of the ontology to get a list of the best matching domain items

  • Receiver operating characteristic (ROC) and precision/recall analysis were used to compare the performance of BOQA with two other ontology-based search procedures


Summary

Introduction

Ontologies are knowledge representations using controlled vocabularies that are designed to help knowledge sharing and computer reasoning (Robinson and Bauer, 2011). Ontologies have become essential components of search engines for the world-wide web, e-commerce and medicine (Kohler et al., 2009; Labrou and Finin, 1999; McGuinness, 2003). They are used both to represent the items of a domain of knowledge and to represent the attributes of those items. For example, the ChEBI ontology provides a comprehensive representation of biologically relevant small molecules (Degtyarenko et al., 2008), whereas the Gene Ontology (GO) provides a comprehensive representation of gene functions (Ashburner et al., 2000), i.e. the attributes of items of the domain of molecular biology. If a gene is annotated to the GO term ATP binding, it is implicitly annotated to all ancestors of the term, including nucleotide binding. This leads to statistical dependencies between ontology terms that can substantially degrade the performance of ontology analysis methods (Alexa et al., 2006; Bauer et al., 2010; Grossmann et al., 2007; Lu et al., 2008).
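
The implicit annotation of a gene to all ancestors of a term can be illustrated with a short sketch. The hierarchy below is a toy assumption: it collapses the intermediate terms that lie between ATP binding and nucleotide binding in the real GO graph, and the names PARENTS, ancestors and propagate are hypothetical helpers introduced for this example, not part of any published implementation.

```python
from collections import deque

# Toy is-a hierarchy, stored as child -> list of parents.
# Assumption: simplified for illustration; the real GO graph contains
# intermediate terms and terms with several parents.
PARENTS = {
    "ATP binding": ["nucleotide binding"],
    "nucleotide binding": ["binding"],
    "binding": ["molecular function"],
    "molecular function": [],
}


def ancestors(term, parents):
    """Return all proper ancestors of `term` in a DAG given as child -> parents."""
    seen = set()
    queue = deque(parents.get(term, []))
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(parents.get(current, []))
    return seen


def propagate(direct_terms, parents):
    """Expand direct annotations to the full set implied by the hierarchy."""
    implied = set(direct_terms)
    for term in direct_terms:
        implied |= ancestors(term, parents)
    return implied


# A gene annotated directly to "ATP binding" only.
implied = propagate({"ATP binding"}, PARENTS)
print(sorted(implied))
```

Under this propagation, every gene annotated to a term is also annotated to each of its ancestors, so the gene sets of related terms are nested. That nesting is one way to see why terms cannot be treated as statistically independent.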

