Abstract
Word embeddings obtained from neural network models such as Word2Vec Skipgram have become popular representations of word meaning and have been evaluated on a variety of word similarity and relatedness norming data. Skipgram generates a set of word and context embeddings, the latter typically discarded after training. We demonstrate the usefulness of context embeddings in predicting asymmetric association between words from a recently published dataset of production norms (Jouravlev & McRae, 2016). Our findings suggest that humans respond with words closer to the cue within the context embedding space (rather than the word embedding space), when asked to generate thematically related words.
Highlights
Modern distributional semantic models such as Word2Vec (Mikolov et al., 2013a,b) and GloVe (Pennington et al., 2014) have been evaluated on a variety of word similarity and relatedness datasets.
It is likely that the human ratings were affected by co-occurrence information encoded in both the word embeddings and the context embeddings.
We proposed several measures, computed from these embeddings, for the complementary judgments of similarity and relatedness.
Summary
Modern distributional semantic models such as Word2Vec (Mikolov et al., 2013a,b) and GloVe (Pennington et al., 2014) have been evaluated on a variety of word similarity and relatedness datasets. Similarity between two words is often assumed to be a direction-less measure (e.g., car and truck are similar due to feature overlap), whereas relatedness is inherently directional (e.g., broom and floor share a functional relationship). However, it is well established in human behavioral data that similarity and relatedness judgments are both asymmetric. The distinction between similarity and relatedness and the asymmetry of the judgments have typically been ignored in recent evaluations of popular embedding models.
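To make the contrast between symmetric and asymmetric measures concrete, the sketch below compares a word-to-word cosine (which is the same in both directions) with a cue-word-to-response-context cosine (which generally is not). It is only an illustration of the idea: the vocabulary, the random matrices `W` and `C`, and the helper names `word_word` and `word_context` are all assumptions made for this example, and the measure shown is one plausible form, not necessarily one of the measures used in the paper.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def word_word(W, vocab, a, b):
    """Symmetric measure: both words are looked up in the word embeddings W."""
    return cosine(W[vocab[a]], W[vocab[b]])

def word_context(W, C, vocab, cue, response):
    """Asymmetric measure (hypothetical form): the cue is looked up in the
    word embeddings W, the response in the context embeddings C."""
    return cosine(W[vocab[cue]], C[vocab[response]])

# Toy stand-ins for trained Skipgram matrices (random values, NOT real embeddings).
vocab = {"broom": 0, "floor": 1, "car": 2, "truck": 3}
rng = np.random.default_rng(0)
W = rng.normal(size=(len(vocab), 8))  # word (input) embeddings
C = rng.normal(size=(len(vocab), 8))  # context (output) embeddings

sym_ab = word_word(W, vocab, "broom", "floor")
sym_ba = word_word(W, vocab, "floor", "broom")
asym_ab = word_context(W, C, vocab, "broom", "floor")
asym_ba = word_context(W, C, vocab, "floor", "broom")

print(f"word-word:    broom->floor {sym_ab:.3f}, floor->broom {sym_ba:.3f}")
print(f"word-context: broom->floor {asym_ab:.3f}, floor->broom {asym_ba:.3f}")
```

With trained embeddings in place of the random matrices, the two word-context scores could be compared against the forward and backward association strengths in a production-norm dataset, whereas the word-word score cannot distinguish the two directions by construction.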