Abstract
The Human Phenotype Ontology (HPO) is widely used for annotating clinical text data, and sufficient annotation is crucial for the effective utilization of clinical texts. It was known that the use of LLMs can successfully extract symptoms and findings, but cannot annotate them with the HPO. We hypothesized that one of the potential issue for this is the lack of appropriate terms in the HPO. Therefore, during the Biomedical Linked Annotation Hackathon 8 (BLAH8), we attempted the following two tasks in order to grasp the overall picture of HPO. (1) Extract all HPO terms for each of the 23 HPO subclasses (defined as categories) directly under the HPO "Phenotypic abnormality" and then (2) search for major attributes in each of 23 categories. We employed LLM for these two tasks related to examining HPO and, at the same time, found that LLM didn't work well without ingenuity for tasks that lacked sentences and context. A manual search for terms within each category revealed that the HPO contains a mix of terms with four major attributes: (1) Disease Name, (2) Condition, (3) Test Data, and (4) Symptoms and Findings. Manual curation showed that the ratio of symptoms and findings varied from 0 to 93.1% across categories. For clinicians, who are end-users of medical terminology including HPO, it is difficult to understand ontologies. However, for good quality ontology is also important for good-quality data, and a clinician’s help is essential. It is also important to make the overall picture and limitations of ontologies easy to understand in order to bring out the explanatory power of LLMs and artificial intelligence.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.