Word Sense Disambiguation (WSD), one of the earliest problems in natural language processing, aims to determine the correct sense of each word in context. The semantic information provided by WSD systems benefits many tasks, such as machine translation, information extraction, and semantic parsing. This work proposes a new approach to WSD that uses a neural network as a surrogate fitness function in a metaheuristic algorithm. It also proposes a new method for training word and sense embeddings simultaneously: the node2vec algorithm is run on the WordNet graph to generate sequences containing both words and senses, and these sequences are then used, along with paragraphs from Wikipedia, as input to the word2vec algorithm to produce embeddings for words and senses at the same time. To address data imbalance in this task, sense probability distributions extracted from the training corpus are used in the search process of the proposed simulated annealing algorithm. Furthermore, we introduce a new approach for clustering and mapping senses in the WordNet graph, which considerably improves the accuracy of the proposed method. In this approach, nodes in the WordNet graph are clustered under the constraint that no two senses of the same word appear in the same cluster. Then, repeatedly, all nodes in each cluster are mapped to a randomly selected node from that cluster, so that the representative node can take advantage of the training instances of all the other nodes in the cluster. The proposed method is trained on the SemCor dataset, with the SemEval-2015 dataset used as the validation set. The final evaluation of the system is performed on SensEval-2, SensEval-3, SemEval-2007, SemEval-2013, SemEval-2015, and the concatenation of all five datasets. The performance of the system is also evaluated on the four content word categories, namely nouns, verbs, adjectives, and adverbs. Experimental results show that the proposed method achieves accuracies ranging from 74.8 to 84.6 percent across these ten evaluation categories, which is close to, and in some cases better than, the state of the art in this task.
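To make the search procedure more concrete, the following is a minimal sketch of simulated annealing over sense assignments, in which new candidates are drawn from a per-word sense prior and scored by a fitness function. It is an illustration under stated assumptions, not the authors' implementation: the neural surrogate is replaced by a hand-written toy function (`toy_fitness`), and the sense labels, prior values, cooling schedule, and all function names are hypothetical.

```python
import math
import random


def propose(assignment, candidates, prior, rng):
    """Resample the sense of one randomly chosen word from its sense prior."""
    new = list(assignment)
    i = rng.randrange(len(new))
    senses = candidates[i]
    # Senses unseen in the (hypothetical) training corpus get a small weight.
    weights = [prior.get(s, 1e-6) for s in senses]
    new[i] = rng.choices(senses, weights=weights, k=1)[0]
    return new


def simulated_annealing(candidates, prior, fitness,
                        steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Search for a high-fitness sense assignment (higher fitness is better)."""
    rng = random.Random(seed)
    # Assumption: start from the most probable sense of every word.
    current = [max(s, key=lambda x: prior.get(x, 0.0)) for s in candidates]
    current_fit = fitness(current)
    best, best_fit = current, current_fit
    t = t0
    for _ in range(steps):
        cand = propose(current, candidates, prior, rng)
        cand_fit = fitness(cand)
        delta = cand_fit - current_fit
        # Accept improvements and ties; accept worse moves with
        # probability exp(delta / t), which shrinks as t cools.
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current, current_fit = cand, cand_fit
            if current_fit > best_fit:
                best, best_fit = current, current_fit
        t = max(t * cooling, 1e-9)
    return best, best_fit


# Toy stand-in for the neural surrogate: counts word pairs whose senses
# share a topic. The paper's fitness function is a trained network instead.
TOPIC = {
    "bank#finance": "money", "bank#river": "water",
    "river#stream": "water",
    "deposit#money": "money", "deposit#sediment": "water",
}


def toy_fitness(assignment):
    return sum(
        1
        for i in range(len(assignment))
        for j in range(i + 1, len(assignment))
        if TOPIC[assignment[i]] == TOPIC[assignment[j]]
    )


# Hypothetical candidate senses and priors for the words "bank river deposit".
candidates = [
    ["bank#finance", "bank#river"],
    ["river#stream"],
    ["deposit#money", "deposit#sediment"],
]
prior = {
    "bank#finance": 0.8, "bank#river": 0.2,
    "river#stream": 1.0,
    "deposit#money": 0.8, "deposit#sediment": 0.2,
}

best, best_fit = simulated_annealing(candidates, prior, toy_fitness)
print(best, best_fit)
```

In this toy setting the prior favors the frequent "money" senses, while the fitness function rewards the globally coherent "water" reading. Sampling proposals from the prior keeps frequent senses likely without ruling out rare ones, which is one plausible reading of how sense distributions could guide the search.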