Abstract

Despite advances in information processing systems, word-sense disambiguation remains far from satisfactory, as evidenced by the numerous limitations of current machine-translation and text-inference systems. This paper investigates new techniques in knowledge-based word-sense disambiguation. First, in the spirit of Lesk's disambiguation algorithm, a new algorithm is established that exploits the WordNet lexical database together with part-of-speech conversion through the established CatVar database, which maps non-noun words to their noun counterparts; the algorithm selects the sense of the target word that maximizes the overall semantic similarity, in the sense of the Wu and Palmer measure, between that sense and the synsets of the context words. Second, motivated by the existence of WordNet domain labels for individual synsets, an overlap-based approach is put forward that quantifies either the intersection of the synset domain sets, when non-empty, or the hierarchical structure of the domain links through a simple path-length measure. Third, instead of exploring the whole set of words in the context, a selective approach is developed that uses syntactic features output by the Stanford Parser together with a fixed-length window. The developed algorithms are evaluated on two commonly employed datasets, where a clear improvement over the baseline algorithm is observed.
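The first technique, choosing the target sense that maximizes overall Wu–Palmer similarity to the context, can be sketched in a few lines of Python. The toy is-a taxonomy below (node names, hierarchy, and the `disambiguate` helper) is purely illustrative and not taken from the paper or from WordNet; the Wu–Palmer formula itself, 2·depth(LCS)/(depth(a)+depth(b)), is standard.

```python
# Illustrative sketch of Wu-Palmer-based sense selection on a hand-made
# is-a taxonomy (parent pointers, rooted at "entity"). All names here are
# hypothetical stand-ins for WordNet synsets.
TAXONOMY = {
    "entity": None,
    "object": "entity",
    "living_thing": "object",
    "animal": "living_thing",
    "fish": "animal",
    "bass_fish": "fish",
    "trout": "fish",
    "artifact": "object",
    "instrument": "artifact",
    "bass_guitar": "instrument",
}

def ancestors(node):
    """Path from node up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = TAXONOMY[node]
    return path

def depth(node):
    return len(ancestors(node))

def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(LCS) / (depth(a) + depth(b))."""
    anc_b = set(ancestors(b))
    lcs = next(n for n in ancestors(a) if n in anc_b)  # deepest shared ancestor
    return 2.0 * depth(lcs) / (depth(a) + depth(b))

def disambiguate(target_senses, context_senses):
    """Pick the target sense maximizing summed best-match similarity
    to each context word's candidate senses."""
    return max(
        target_senses,
        key=lambda s: sum(max(wu_palmer(s, c) for c in senses)
                          for senses in context_senses),
    )

# "bass" near a fishing-related context word resolves to the fish sense.
best = disambiguate(["bass_fish", "bass_guitar"], [["trout"]])
print(best)  # bass_fish
```

With a real lexicon one would replace the toy taxonomy with WordNet synsets (e.g. via NLTK's `wup_similarity`) and route non-noun context words through CatVar first, as the abstract describes.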
