Abstract

This paper continues our work on natural language processing for Croatian in the medical domain. Having annotated single nouns in our corpus of pharmaceutical instructions for medications, we now shift the focus to multiword expressions (MWEs). The project still relies on the nouns from the previous step to detect MWEs in which the noun is the main carrier of the medical meaning. However, where the head noun is more general and not directly associated with the medical domain (e.g., bubrežna funkcija ‘kidney function’), we use a NooJ morphological grammar to check whether the root of the preceding adjective is associated with a noun that is found in the main dictionary and annotated as a medical domain noun. That is, we check whether the adjective (endoskopski ‘endoscopic’) has a corresponding noun (endoskopija ‘endoscopy’) that is already marked in the NooJ dictionary as belonging to the medical domain. In such cases, we assume that the adjective belongs to the same domain as the noun and that the medical domain attribute can be inherited, not only by the adjective but by the entire MWE as well.

The project aims to support the automatic extraction and annotation of single adjectives from the medical domain, as well as the identification of medical MWEs. Additionally, we wanted to learn more about which component carries the domain-specific meaning in Croatian MWEs.

Keywords: Medical domain corpus; Detecting MWE; Domain specific meaning; Morphology; Syntax; Croatian language; NooJ
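The adjective-to-noun check described in the abstract can be illustrated with a minimal Python sketch. It is not the NooJ morphological grammar used in the project: the toy lexicon, the suffix list, the consonant alternation table, and all function names are assumptions introduced here for illustration only.

```python
# Toy lexicon standing in for a NooJ dictionary of medical-domain nouns (illustrative only).
MEDICAL_NOUNS = {"endoskopija", "bubreg", "jetra"}

# Assumed subset of Croatian adjectival endings, checked in this order.
ADJ_SUFFIXES = ("skog", "skom", "ski", "ska", "sko", "ske",
                "nog", "nom", "ni", "na", "no", "ne")

# Assumed subset of stem-final consonant alternations (e.g. bubreg -> bubrežni).
ALTERNATIONS = {"ž": "g", "č": "k", "š": "h"}


def adjective_roots(adjective):
    """Return candidate roots obtained by stripping one known adjectival ending."""
    word = adjective.lower()
    for suffix in ADJ_SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            root = word[: -len(suffix)]
            roots = [root]
            # Also try the root with the alternation undone.
            if root[-1] in ALTERNATIONS:
                roots.append(root[:-1] + ALTERNATIONS[root[-1]])
            return roots
    return []


def related_medical_noun(adjective):
    """Return a medical noun sharing a candidate root with the adjective, or None."""
    for root in adjective_roots(adjective):
        for noun in sorted(MEDICAL_NOUNS):
            if noun.startswith(root):
                return noun
    return None


def is_medical_mwe(adjective, noun):
    """An adjective + noun pair counts as medical if the noun is medical
    or the adjective inherits the attribute from a related medical noun."""
    return noun.lower() in MEDICAL_NOUNS or related_medical_noun(adjective) is not None
```

With this toy lexicon, `related_medical_noun("endoskopski")` finds `endoskopija`, and `is_medical_mwe("bubrežna", "funkcija")` holds because the adjective is linked to `bubreg` even though `funkcija` itself is not a medical noun. Prefix matching on a stripped root is a deliberately crude stand-in for morphological derivation and would over-match on a realistic dictionary.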
