A MODIFIED WORD SENSE DISAMBIGUATION METHOD BASED ON DISTRIBUTED REPRESENTATION METHODS

Y.A. Kravchenko, J.H. Mohammad, A.M. Mansour

doi:10.18522/2311-3103-2021-3-92-101

Abstract

In text mining tasks, textual representation should be not only efficient but also interpretable, as this enables an understanding of the operational logic underlying the data mining models. This paper describes a modified Word Sense Disambiguation (WSD) method which extends two well-known variations of the Lesk WSD approach. Given a word and its context, the Lesk algorithm bases its calculations on the overlap between the context of the word and the definition (gloss) of each of its senses in order to select the proper meaning. The main contribution of the proposed method is the adoption of the concept of "similarity" between definition and context instead of "overlap", in addition to expanding the definition with the examples provided by WordNet for each sense of the target word. The proposed method is also characterized by the use of text similarity measurement functions defined in a distributed semantic space. The proposed method has been tested on five different benchmark datasets for word sense disambiguation tasks and compared with several baseline methods, including simple Lesk, extended Lesk, WordNet 1st sense, Babelfy and UKB. The results show that the proposed method outperforms most of the baseline methods, with the exception of Babelfy and the WordNet 1st sense method.
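The idea of replacing gloss overlap with similarity in a vector space can be illustrated with a minimal sketch. It is not the authors' implementation: the word vectors, the two-sense inventory for "bank", and the choice of averaged word vectors compared by cosine similarity are all assumptions made for illustration. In practice the glosses and examples would come from WordNet and the vectors from a trained embedding model.

```python
import math

# Hypothetical 3-dimensional word vectors, invented for illustration only.
VECTORS = {
    "money": (0.9, 0.1, 0.0),
    "deposit": (0.8, 0.2, 0.0),
    "loan": (0.9, 0.0, 0.1),
    "financial": (1.0, 0.0, 0.0),
    "institution": (0.7, 0.1, 0.2),
    "river": (0.0, 0.9, 0.1),
    "water": (0.1, 0.9, 0.0),
    "land": (0.1, 0.7, 0.2),
    "fished": (0.0, 0.8, 0.1),
}

# Hypothetical sense inventory: each sense has a gloss and usage examples,
# mirroring the "gloss plus examples" expansion described in the abstract.
SENSES = {
    "bank.finance": {
        "gloss": "financial institution that accepts deposit and gives loan",
        "examples": ["he put money in the bank"],
    },
    "bank.river": {
        "gloss": "sloping land beside a body of water",
        "examples": ["they fished from the river bank"],
    },
}


def embed(text):
    """Average the vectors of known words; unknown words are skipped."""
    known = [VECTORS[w] for w in text.lower().split() if w in VECTORS]
    if not known:
        return (0.0, 0.0, 0.0)
    return tuple(sum(dim) / len(known) for dim in zip(*known))


def cosine(u, v):
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def disambiguate(context, senses):
    """Pick the sense whose gloss plus examples is most similar to the context."""
    context_vec = embed(context)

    def score(sense_id):
        entry = senses[sense_id]
        sense_text = " ".join([entry["gloss"], *entry["examples"]])
        return cosine(context_vec, embed(sense_text))

    return max(senses, key=score)


print(disambiguate("she sat by the water near the river", SENSES))
print(disambiguate("the loan needed a deposit of money", SENSES))
```

Unlike the overlap count used by the original Lesk approach, this kind of score can favor a sense even when the context and the gloss share no word at all, as long as their words lie close together in the vector space.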
