Abstract

PurposeEffective communication is crucial in the medical field where different stakeholders use various terminologies to describe and classify healthcare concepts such as ICD, SNOMED CT, UMLS and MeSH, but the problem of polysemy can make natural language processing difficult. This study explores the contextual meanings of the term “pattern” in the biomedical literature, compares them to existing definitions, annotates a corpus for use in machine learning and proposes new definitions of terms such as “Syndrome, feature” and “pattern recognition.”Design/methodology/approachEntrez API was used to retrieve articles form PubMed for the study which assembled a corpus of 398 articles using a search query for the ambiguous term “pattern” in the titles or abstracts. The python NLTK library was used to extract the terms and their contexts, and an expert check was carried out. To understand the various meanings of the term, the contextual environment was analyzed by extracting the surrounding words of the term. The expert determined the appropriate size of the context for analysis to gain a more nuanced understanding of the different meanings of the term pattern.FindingsThe study found that the categories of meanings of the term “pattern” are broader in biomedical publications than in common definitions, and new categories have been emerging from the term's use in the biomedical field. The study highlights the importance of annotated corpora in advancing natural language processing techniques and provides valuable insights into the nuances of biomedical language.Originality/valueThe study's findings demonstrate the importance of exploring contextual meanings and proposing new definitions of terms in the biomedical field to improve natural language processing techniques.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call