Abstract

This study tests the coverage of SNOMED CT as an expansion source for the automated expansion of clinical terms found in discharge summaries. Term expansion is commonly used in knowledge extraction, query formulation and semantic modelling, among other applications. However, characteristics of the expansion sources can affect the credibility of the outputs, and coverage is one such characteristic. We developed an automated method for testing the coverage of more than one source at a time. We applied several methods to clean our corpus of discharge summaries before extracting text fragments as candidates for clinical concepts. We then used Unified Medical Language System (UMLS) sources and the UMLS REST API to filter concepts from the pool of text fragments. Statistical measures such as true positive rate and false negative rate were used to assess the coverage of each source. We also tested the coverage of the individual SNOMED CT hierarchies using the same methods. Our findings suggest that a combination of the four terminologies tested (SNOMED CT, NCI, LNC and MSH) achieves over 90% coverage for term expansion. We also found that the SNOMED CT hierarchies that hold clinically relevant concepts provided 60% coverage. We believe that our findings and the method we developed will be of use to both scientists and practitioners working in the domain of knowledge extraction.
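To illustrate how true positive rate and false negative rate can express coverage, the following is a minimal sketch, not the authors' implementation. It assumes that coverage is measured as the share of reference concepts found in a source, and that combining sources means taking the union of what each one covers. The function names (`coverage_rates`, `combined_coverage`) and the concept identifiers (`c1` to `c5`) are hypothetical, and the toy data is invented for illustration only.

```python
from typing import Dict, Set


def coverage_rates(reference: Set[str], covered: Set[str]) -> Dict[str, float]:
    """Return true positive rate and false negative rate for one source.

    Assumption: a reference concept counts as a true positive when the
    source contains it, and as a false negative when it does not.
    """
    if not reference:
        raise ValueError("reference set must not be empty")
    true_positives = len(reference & covered)
    false_negatives = len(reference - covered)
    total = true_positives + false_negatives
    return {
        "tpr": true_positives / total,
        "fnr": false_negatives / total,
    }


def combined_coverage(
    reference: Set[str], per_source: Dict[str, Set[str]]
) -> Dict[str, float]:
    """Coverage of several sources together (assumed: union of their concepts)."""
    union: Set[str] = set().union(*per_source.values())
    return coverage_rates(reference, union)


# Hypothetical toy data: five reference concepts and two sources.
reference = {"c1", "c2", "c3", "c4", "c5"}
per_source = {
    "SNOMED CT": {"c1", "c2", "c3"},
    "NCI": {"c3", "c4"},
}

single = coverage_rates(reference, per_source["SNOMED CT"])
combined = combined_coverage(reference, per_source)

if __name__ == "__main__":
    print("SNOMED CT alone:", single)
    print("SNOMED CT + NCI:", combined)
```

Under these assumptions the two rates always sum to one, so a source with a high true positive rate leaves few reference concepts unexpanded, and adding a second source can only keep or raise the combined rate.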
