Abstract
SNOMED CT provides a standardized terminology for clinical concepts, allowing cohort queries over heterogeneous clinical data including Electronic Health Records (EHRs). While it is intuitive that missing and inaccurate subtype (or is-a) relations in SNOMED CT reduce the recall and precision of cohort queries, the extent of these impacts has not been formally assessed. This study addresses that gap by developing quantitative metrics to measure these impacts and by testing their statistical significance. We used the Optum de-identified COVID-19 Electronic Health Record dataset. We defined micro-averaged and macro-averaged recall and precision metrics to assess the impact of missing and inaccurate is-a relations on cohort queries. Both practical and simulated analyses were performed. Practical analyses involved 407 missing and 48 inaccurate is-a relations confirmed by domain experts. Simulated analyses used two random sets of 400 is-a relations, one to simulate missing relations and one to simulate inaccurate relations. Wilcoxon signed-rank tests in both the practical and simulated analyses (P-values < .001) showed that missing is-a relations significantly reduced the micro- and macro-averaged recall, and inaccurate is-a relations significantly reduced the micro- and macro-averaged precision. The introduced impact metrics can assist SNOMED CT maintainers in prioritizing critical hierarchical defects for quality enhancement. These metrics are generally applicable for assessing the quality impact of a terminology's subtype hierarchy on its cohort query applications. Our results indicate a significant impact of missing and inaccurate is-a relations in SNOMED CT on the recall and precision of cohort queries. Our work highlights the importance of a high-quality terminology hierarchy for cohort queries over EHR data and provides valuable insights for prioritizing quality improvements of SNOMED CT's hierarchy.
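The difference between micro- and macro-averaging can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact formulas, so this uses the conventional definitions as an assumption: micro-averaging pools patient counts across all query concepts, while macro-averaging takes the mean of the per-query ratios. The function name `micro_macro`, the concept labels, and the patient identifiers are all hypothetical.

```python
from typing import Dict, Set, Tuple


def micro_macro(numerators: Dict[str, int], denominators: Dict[str, int]) -> Tuple[float, float]:
    """Return (micro, macro) averages of per-query ratios.

    Queries with a zero denominator are skipped (an assumption, not the paper's rule).
    """
    usable = [q for q, d in denominators.items() if d > 0]
    if not usable:
        return 0.0, 0.0
    # Micro: pool the counts over all queries, then divide once.
    micro = sum(numerators[q] for q in usable) / sum(denominators[q] for q in usable)
    # Macro: compute one ratio per query, then take the unweighted mean.
    macro = sum(numerators[q] / denominators[q] for q in usable) / len(usable)
    return micro, macro


# Hypothetical data: patients each query SHOULD return (corrected hierarchy)
# versus patients it DOES return (hierarchy with a missing is-a relation).
relevant: Dict[str, Set[str]] = {
    "concept_A": {"p1", "p2", "p3", "p4"},
    "concept_B": {"p5", "p6"},
}
retrieved: Dict[str, Set[str]] = {
    "concept_A": {"p1", "p2", "p3", "p4"},
    "concept_B": {"p5"},  # p6 is lost because a subtype is not reachable
}

# True positives per query: retrieved patients that are also relevant.
hits = {q: len(retrieved[q] & relevant[q]) for q in relevant}

# Recall divides hits by the number of relevant patients.
micro_recall, macro_recall = micro_macro(hits, {q: len(relevant[q]) for q in relevant})

# Precision divides hits by the number of retrieved patients.
micro_precision, macro_precision = micro_macro(hits, {q: len(retrieved[q]) for q in relevant})
```

In this toy case the missing relation lowers micro-averaged recall to 5/6 and macro-averaged recall to 0.75, while precision stays at 1.0 for both averages. The macro average is hit harder because the small affected query counts as much as the large unaffected one.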
More From: Journal of the American Medical Informatics Association : JAMIA