Abstract

In this paper, we describe our method for measuring the semantic similarity of a given pair of words, submitted to the SemEval-2017 monolingual semantic word similarity task. We use a combination of knowledge-based and corpus-based techniques. Specifically, we use FarsNet, the Persian WordNet, alongside deep learning techniques to estimate the similarity of words. We evaluated our approach on the Persian (Farsi) test data of SemEval-2017, where it outperformed the other participants and ranked first in the challenge.

Summary

Introduction

Semantic similarity is a special case of semantic relatedness: for example, cars and gasoline would seem to be more closely related than, say, cars and bicycles, but the latter pair is certainly more similar (Resnik et al., 1999). Semantic similarity has been used in many applications in natural language processing. In the SemEval-2017 monolingual semantic word similarity task, a system is given a pair of words and must automatically measure their semantic similarity, scoring the pair on a scale from 0 to 4, where 4 denotes that the two words are synonymous and 0 indicates that they are completely dissimilar (Camacho-Collados et al., 2017). In subtask 1, in which we participated, the two words in each pair belong to the same language. This subtask provides five monolingual word similarity datasets, in English, German, Italian, Spanish and Farsi.
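To make the scoring scale concrete, the following is a minimal illustrative sketch, not the system described in this paper. It assumes that a corpus-based score is obtained as the cosine similarity of two word vectors, that a knowledge-based score in [0, 1] is already available from some lexical resource, and that the two are blended with a fixed weight before being rescaled to the task's 0 to 4 range. The function names, the weight of 0.5, and the linear blend are all hypothetical choices made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def to_task_scale(similarity, max_score=4.0):
    """Clamp a similarity to [0, 1] and rescale it to the task's [0, max_score] range."""
    clipped = min(1.0, max(0.0, similarity))
    return max_score * clipped


def combined_score(corpus_sim, knowledge_sim, weight=0.5):
    """Hypothetical linear blend of a corpus-based and a knowledge-based similarity.

    Both inputs are assumed to lie in [0, 1]; `weight` is the share given to
    the corpus-based score. This blend is an assumption, not the paper's method.
    """
    blended = weight * corpus_sim + (1.0 - weight) * knowledge_sim
    return to_task_scale(blended)


if __name__ == "__main__":
    # Toy vectors standing in for word embeddings (made-up values).
    vec_a = [1.0, 0.0, 1.0]
    vec_b = [1.0, 0.0, 0.0]
    corpus_sim = cosine_similarity(vec_a, vec_b)
    # 0.5 is a placeholder for a knowledge-based score.
    print(combined_score(corpus_sim, knowledge_sim=0.5))
```

Clamping before rescaling keeps the output inside the task's range even when the cosine similarity of two embeddings is negative.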

Related Works
The Proposed Method
Corpus-based Method
Knowledge-based Methods
Gloss-Hyper
Experimental Results
Conclusions and Future Work