Abstract

Unprincipled modeling decisions in large-domain ontologies, such as SNOMED CT, are problematic and might act as a barrier to their quality assurance and successful use in electronic health records. Most previous work has focused on clustering problematic concepts, which is helpful for quality control but faces difficulties in pinpointing the origin of those modeling problems. In this study, we examined the underlying structural patterns in SNOMED CT’s data model, as such patterns directly reflect the modeling strategies of editors. Our results showed that 92% of all structural patterns found were concentrated in the Procedure and Clinical finding sub-hierarchies, and that pattern reuse was low: over 30% of patterns were used only once. A qualitative analysis of a sample of 50 such singleton patterns revealed modeling problems, including redundancy, omission, and inconsistency. The problems detected in the sample suggest that the analysis of structural patterns is a valuable technique for revealing problematic areas of SNOMED CT and modeling the styles of terminology editors. Furthermore, the patterns that describe the modeling of a large number of concepts could provide insights for template creation and refinement in SNOMED CT.

Highlights

  • SNOMED CT is a comprehensive multilingual clinical terminology designed to support the consistent and processable representation of clinical content in electronic health records [1]

  • We aimed to improve our understanding of modeling idiosyncrasies in SNOMED CT, a comprehensive clinical health terminology modeled using description logics that constitutes an important blueprint for terminology standardization

  • We studied the distribution of structural patterns in stated relationships breadthwise and depthwise, as they directly represent the modeling strategies of SNOMED CT editors


Summary

Introduction

SNOMED CT is a comprehensive multilingual clinical terminology designed to support the consistent and processable representation of clinical content in electronic health records [1]. In contrast with other terminologies, SNOMED CT is based on a solid ontological framework that uses formal reasoning in its content production and maintenance. In SNOMED CT, terms are organized hierarchically in over 300 000 representational units, called SNOMED CT concepts, which are described by formal axioms conforming to the EL++ description logics standard [5]. In this sense, SNOMED CT is unique among the currently used clinical terminologies. Its sheer size means that formal or automated methods are required for quality assurance and that new methodologies for examining it are needed, a fact that has been acknowledged since the preliminary stages of SNOMED CT, when lexical techniques were already being investigated [6].

