Abstract

Abstract Word sense disambiguation (WSD) is the task of selecting correct sense for an ambiguous word in its context. Since WSD is one of the most challenging tasks in various text processing systems, improving its accuracy can be very beneficial. In this article, we propose a new unsupervised method based on co-occurrence graph created by monolingual corpus without any dependency on the structure and properties of the language itself. In the proposed method, the context of an ambiguous word is represented as a sub-graph extracted from a large word co-occurrence graph built based on a corpus. Most of the words are connected in this graph. To clarify the exact sense of an ambiguous word, its senses and relations are added to the context graph, and various similarity functions are employed based on the senses and context graph. In the disambiguation process, we select senses with highest similarity to the context graph. As opposite to other WSD methods, the proposed method does not use any language-dependent resources (e.g. WordNet) and it just uses a monolingual corpus. Therefore, the proposed method can be employed for other languages. Moreover, by increasing the size of corpus, it is possible to enhance the accuracy of WSD. Experimental results on English and Persian datasets show that the proposed method is competitive with existing supervised and unsupervised WSD approaches.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.