Abstract

We demonstrate several ways to use morphological word analogies to examine the representation of complex words in semantic vector spaces. We present a set of morphological relations, each of which can be used to generate many word analogies. (1) We show that the difference vectors of pairs that stand in the same relation are similarly aligned. (2) We suggest that addition of difference vectors is a useful phrase-building operator. (3) We propose that pairs in the same relation may have similar relative frequencies. (4) We suggest that homographs, which necessarily share a single semantic vector, can sometimes be separated into different vectors for different senses, using frequency estimates and alignment constraints obtained from word analogies. (5) We observe that some of our analogies seem to be parallel and might be combined. We use Arabic words as a case study because Arabic orthography combines verb conjugations, object pronouns, definite articles, possessive pronouns, and some prepositions into single word forms. As a result, a number of short phrases, built up of easily perceived constituents, are already present in stock semantic spaces for Arabic available on the web. Similar phrases in English would require including bigrams or trigrams as lemmas in the word embedding, although English derivational morphology allows for other relationships in standard semantic spaces that Arabic does not, for example negation. We make our corpus of morphological relations available to other researchers.
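The alignment claim in point (1) can be illustrated with a minimal sketch. It assumes that alignment is measured as the cosine similarity between difference vectors, which the abstract does not specify. The function names `cosine` and `difference_alignment`, the toy vectors, and the placeholder word labels are all hypothetical and are not taken from the authors' corpus or from any real Arabic embedding.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def difference_alignment(emb, pairs):
    """Mean pairwise cosine similarity of the difference vectors
    (derived form minus base form) for word pairs in one relation."""
    diffs = [emb[derived] - emb[base] for base, derived in pairs]
    sims = [
        cosine(diffs[i], diffs[j])
        for i in range(len(diffs))
        for j in range(i + 1, len(diffs))
    ]
    return sum(sims) / len(sims)

# Hand-made toy vectors (assumption: stand-ins for real embeddings).
toy = {
    "base1": np.array([1.0, 0.0, 0.0]),
    "base1+affix": np.array([1.0, 1.0, 0.0]),
    "base2": np.array([0.0, 0.0, 1.0]),
    "base2+affix": np.array([0.0, 1.0, 1.0]),
}

# Two pairs assumed to stand in the same morphological relation.
pairs = [("base1", "base1+affix"), ("base2", "base2+affix")]
alignment = difference_alignment(toy, pairs)
print(alignment)
```

In this toy setup both difference vectors point in the same direction, so the score is at its maximum. With real embeddings the score for a genuine relation would be expected to fall below that maximum but above the score for randomly paired words.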
