Abstract

Languages belonging to the same language family share a number of common characteristics, called language pair phenomena, which are useful in multilingual processing tasks such as translation between cognate languages, building dictionaries, thesauri, and transcript collections, and multilingual text retrieval of digital documents. In addition, it is estimated that more than 30% of English vocabulary has been inherited from Latin, which has dominated medical terminology in particular. We exploit this fact to explore word sense disambiguation (WSD) in a multilingual environment. In the medical domain specifically, language pair phenomena can be limited to synonymy of cognate technical terms. We evaluate our approach by comparing Boolean and free-text search modes. To measure the effectiveness of our methodology we use the classical Salton model of tf-idf term weighting, as extended by Karen Sparck Jones. Our results are very promising: they indicate that the similarity between English medical terms and their synonymous target-language equivalents significantly narrows the set of candidate target word senses, even when the target language lies outside the language family, as with the English and Polish language pair. Narrowing the number of target word senses yields better, more context-driven disambiguation, which in turn translates into higher precision in multilingual medical information retrieval.
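
For readers unfamiliar with the weighting scheme named above, the following is a minimal sketch of plain tf-idf (raw term frequency multiplied by the logarithm of inverse document frequency). It is not the authors' implementation: the function name `tf_idf`, the exact weighting variant, and the toy tokenized documents are assumptions chosen for illustration only.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: weight = raw term frequency * ln(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)  # raw count of each term within this document
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights

# Hypothetical toy corpus of already-tokenized medical snippets.
docs = [
    ["cardiac", "arrest", "cardiac"],
    ["cardiac", "muscle"],
    ["renal", "failure"],
]
weights = tf_idf(docs)
```

In this toy corpus, "arrest" receives a higher weight in the first document than "cardiac" even though "cardiac" occurs twice, because "cardiac" also appears in a second document and is therefore less discriminating.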
