Abstract
In this paper, we describe our proposed method for measuring semantic similarity for a given pair of words at SemEval-2017 monolingual semantic word similarity task. We use a combination of knowledge-based and corpus-based techniques. We use FarsNet, the Persian Word Net, besides deep learning techniques to extract the similarity of words. We evaluated our proposed approach on Persian (Farsi) test data at SemEval-2017. It outperformed the other participants and ranked the first in the challenge.
Highlights
Semantic similarity represents a special case of semantic relatedness: for example, cars and gasoline would seem to be more closely related than, say, cars and bicycles, but the latter pair are certainly more similar(Resnik et al, 1999)
Semantic similarity has been used in many application in natural language processing
At SemEval-2017 monolingual semantic word similarity task, given a pair of words, we have to automatically measure their semantic similarity and score them according to a [0-4] similarity scale where 4 denotes that the two words are synonymous and 0 indicates that they are completely dissimilar(Camacho-Collados et al, 2017)
Summary
Semantic similarity represents a special case of semantic relatedness: for example, cars and gasoline would seem to be more closely related than, say, cars and bicycles, but the latter pair are certainly more similar(Resnik et al, 1999). Semantic similarity has been used in many application in natural language processing. At SemEval-2017 monolingual semantic word similarity task, given a pair of words, we have to automatically measure their semantic similarity and score them according to a [0-4] similarity scale where 4 denotes that the two words are synonymous and 0 indicates that they are completely dissimilar(Camacho-Collados et al, 2017). In subtask 1 in which we participated, the two words in the pair belong to the same language. This subtask provides five monolingual word similarity datasets in English, German, Italian, Spanish and Farsi.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have