Abstract

Background: Ontologies and taxonomies are among the most important computational resources for molecular biology and bioinformatics. A series of recent papers has shown that the Gene Ontology (GO), the most prominent taxonomic resource in these fields, is marked by flaws of certain characteristic types, which flow from a failure to address basic ontological principles. As yet, no methods have been proposed which would allow ontology curators to pinpoint flawed terms or definitions in ontologies in a systematic way.

Results: We present computational methods that automatically identify terms and definitions which are defined in a circular or unintelligible way. We further demonstrate the potential of these methods by applying them to isolate a subset of 6001 problematic GO terms. By automatically aligning GO with other ontologies and taxonomies we were able to propose alternative synonyms and definitions for some of these problematic terms. This allows us to demonstrate that these other resources do not contain definitions superior to those supplied by GO.

Conclusion: Our methods provide reliable indications of the quality of terms and definitions in ontologies and taxonomies. Further, they are well suited to drawing the attention of ontology curators to those terms that are ill-defined. We have further shown the limitations of ontology mapping and alignment in assisting ontology curators in rectifying problems, thus pointing to the need for manual curation.

Highlights

  • Ontologies and taxonomies are among the most important computational resources for molecular biology and bioinformatics

  • We note with satisfaction that the Gene Ontology (GO) Consortium has recognized the importance of the problems addressed in this communication, and is taking steps to rectify them in conjunction with the developers of other Open Biomedical Ontologies (OBO)

  • The methods introduced in this paper offer what we believe to be a reliable means for assessing the quality of terms and their definitions in ontologies and taxonomies


Summary

Introduction

Taxonomies and ontologies are of increasing importance in functional genomics and molecular biology, and the Gene Ontology (GO) [1] has established itself as one of the most important computational resources in these and related fields. Several of the ontologies in the Open Biomedical Ontologies (OBO) Consortium, of which GO is the best-known resource, have had a major impact on the annotation of genomes [2] and are often used as controlled vocabularies in database integration systems [3]. Our investigation here pertains to the ways in which GO and similar ontologies fall short of conforming to principles that apply to the naming and definition of ontological terms. The proposals advanced in [19] are being applied in ongoing revisions of GO's definitions.
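To make the notion of a circular definition concrete, the following is a minimal sketch of one plausible check: a definition is flagged when it reuses every content word of the term it is meant to define. This is an illustration only, not necessarily the method used in the paper. The function names, the stop-word list, and the two example entries are all hypothetical; the example entries are invented for this sketch and are not actual GO terms or definitions.

```python
import re

# Hypothetical stop-word list; a real check would need a fuller one.
STOP_WORDS = {"a", "an", "and", "by", "in", "of", "or", "the", "to", "with"}


def content_words(text):
    """Lower-case the text and return its set of words minus stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOP_WORDS}


def is_circular(term, definition):
    """Flag a definition that reuses every content word of its own term."""
    term_words = content_words(term)
    # An empty term carries no information, so it is never flagged.
    return bool(term_words) and term_words <= content_words(definition)


# Invented examples for illustration; not actual GO entries.
examples = {
    "membrane fusion": "The fusion of a membrane with another membrane.",
    "cell death": "The process by which a cell ceases to function.",
}

flagged = [term for term, text in examples.items() if is_circular(term, text)]
print(flagged)
```

A check of this kind matches exact word forms only, so it would miss circularity hidden behind inflected forms or synonyms. Any flagged term would still need review by a curator.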

