Abstract

Background: Summarization networks are compact summaries of ontologies. The “Big Picture” view offered by summarization networks makes it possible to identify sets of concepts that are more likely to have errors than control concepts. For ontologies that have outgoing lateral relationships, we have developed the "partial-area taxonomy" summarization network. Prior research has identified one kind of outlier concept: concepts of small partial-areas within partial-area taxonomies. Previously we have shown that the small partial-area technique works successfully for four ontologies (or their hierarchies).

Methods: To improve the scalability of Quality Assurance (QA), a family-based QA framework was developed, in which one QA technique is potentially applicable to a whole family of ontologies with similar structural features. The 373 ontologies hosted at the NCBO BioPortal in 2015 were classified into a collection of families based on structural features. A meta-ontology represents this family collection, including one family of ontologies having outgoing lateral relationships. The process of updating the current meta-ontology is described. To conclude that one QA technique is applicable for at least half of the members of a family F, the technique should be demonstrated as successful for six out of six ontologies in F. We describe a hypothesis setting the condition required for a technique to be successful for a given ontology. The process of a study to demonstrate such success is described. This paper intends to prove the scalability of the small partial-area technique.

Results: We first updated the meta-ontology, classifying 566 BioPortal ontologies. There were 371 ontologies in the family with outgoing lateral relationships. We demonstrated the success of the small partial-area technique for two ontology hierarchies that belong to this family: SNOMED CT’s Specimen hierarchy and NCIt’s Gene hierarchy. Together with the four previous ontologies from the same family, we fulfilled the “six out of six” condition required to show the scalability for the whole family.

Conclusions: We have shown that the small partial-area technique can be potentially successful for the family of ontologies with outgoing lateral relationships in BioPortal, thus improving the scalability of this QA technique.

Highlights

  • Summarization networks are compact summaries of ontologies

  • In this paper, we investigate the small partial-area technique on two more ontologies: SNOMED Clinical Terms (SNOMED CT)’s Specimen hierarchy and National Cancer Institute Thesaurus (NCIt)’s Gene hierarchy, which belong to the same family as the previous four ontologies

  • Results of the Quality Assurance (QA) study on the SNOMED CT Specimen hierarchy: the partial-area taxonomy derived from the Specimen hierarchy, which has 1696 concepts in the SNOMED CT January 2018 release, has 23 areas and 530 partial-areas


Summary

Introduction

Biomedical ontologies are essential for biomedical information systems and for their interoperability [1,2,3,4,5]. They are critical for biomedical research, e.g., phenotyping with EHR text [3,6,7,8,9]. A non-hierarchical semantic relationship is called a lateral relationship, in contrast to the hierarchical is-a relationship. An area is the group of all concepts having exactly the same set of lateral relationship types. A partial-area is a subunit of an area, defined by a root concept that describes the semantics of the partial-area; it consists of the root together with all of its descendant concepts within the area, which share the same semantics.
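The definitions of area and partial-area can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the toy concepts, and the relationship type names (`rel_A`, `rel_B`) are invented for illustration, and the `max_size` cutoff for a "small" partial-area is a hypothetical parameter, since no threshold is stated here. The sketch also assumes that a root is a concept with no is-a parent inside its own area and that descendants are collected along is-a paths that stay within the area.

```python
from collections import defaultdict


def partial_area_taxonomy(parents, lateral_types):
    """Group concepts into areas and partial-areas.

    parents: dict mapping each concept to the set of its is-a parents.
    lateral_types: dict mapping each concept to its set of lateral
        relationship type names.
    Returns (areas, partial_areas), where areas maps a frozenset of
    relationship types to its concepts, and partial_areas maps each
    root concept to the concepts of its partial-area.
    """
    # An area holds all concepts with exactly the same set of
    # lateral relationship types.
    areas = defaultdict(set)
    for concept, types in lateral_types.items():
        areas[frozenset(types)].add(concept)

    # Invert the is-a links so descendants can be traversed.
    children = defaultdict(set)
    for child, child_parents in parents.items():
        for parent in child_parents:
            children[parent].add(child)

    partial_areas = {}
    for members in areas.values():
        # Assumption: a root has no is-a parent inside its own area.
        roots = [c for c in members if not (parents.get(c, set()) & members)]
        for root in roots:
            seen = {root}
            stack = [root]
            while stack:
                node = stack.pop()
                # Only follow is-a links that stay inside the area.
                for child in children[node] & members:
                    if child not in seen:
                        seen.add(child)
                        stack.append(child)
            partial_areas[root] = seen
    return dict(areas), partial_areas


# Toy hierarchy; all names below are invented for illustration.
parents = {
    "Specimen": set(),
    "Tissue specimen": {"Specimen"},
    "Fluid specimen": {"Specimen"},
    "Biopsy sample": {"Tissue specimen"},
    "Blood specimen": {"Fluid specimen"},
    "Swab": {"Specimen"},
}
lateral_types = {
    "Specimen": set(),
    "Tissue specimen": {"rel_A"},
    "Fluid specimen": {"rel_A"},
    "Biopsy sample": {"rel_A", "rel_B"},
    "Blood specimen": {"rel_A"},
    "Swab": {"rel_B"},
}

areas, partial_areas = partial_area_taxonomy(parents, lateral_types)

# Hypothetical cutoff: treat partial-areas with at most max_size
# concepts as "small" candidates for QA review.
max_size = 1
small_roots = sorted(r for r, c in partial_areas.items() if len(c) <= max_size)
```

In this toy hierarchy, "Tissue specimen" and "Fluid specimen" fall in the same area because both have exactly the type set {rel_A}, but each is the root of its own partial-area because neither has a parent inside that area.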

