Abstract

Vector space word representations are learned from distributional information about words in large corpora. Although such statistics are semantically informative, they disregard the valuable information contained in semantic lexicons such as WordNet, FrameNet, and the Paraphrase Database. This paper proposes a method for refining vector space representations using relational information from semantic lexicons by encouraging linked words to have similar vector representations; the method makes no assumptions about how the input vectors were constructed. Evaluated on a battery of standard lexical semantic evaluation tasks in several languages, the method yields substantial improvements when starting from a variety of word vector models. It also outperforms prior techniques for incorporating semantic lexicons into word vector training algorithms.
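
The idea of "encouraging linked words to have similar vector representations" can be illustrated with a small sketch. The abstract does not state the update rule, so the code below assumes a simple iterative averaging scheme: each word vector is repeatedly replaced by a weighted mean of its original value and the current vectors of its lexicon neighbours. The function name `retrofit`, the weight `alpha`, the uniform neighbour weighting, and the toy vectors are all illustrative assumptions, not details taken from this page.

```python
import numpy as np


def retrofit(vectors, lexicon, iterations=10, alpha=1.0):
    """Pull each word vector toward its lexicon neighbours.

    vectors: dict mapping word -> sequence of floats (pre-trained vectors
             from any model; no assumption about how they were built).
    lexicon: dict mapping word -> list of linked words.
    alpha:   assumed weight that keeps a vector close to its original value.
    """
    original = {w: np.asarray(v, dtype=float) for w, v in vectors.items()}
    refined = {w: v.copy() for w, v in original.items()}

    for _ in range(iterations):
        for word, neighbours in lexicon.items():
            if word not in refined:
                continue
            linked = [n for n in neighbours if n in refined and n != word]
            if not linked:
                continue
            # Assumed weighting: neighbours share a total weight of 1.
            beta = 1.0 / len(linked)
            neighbour_sum = sum(beta * refined[n] for n in linked)
            refined[word] = (neighbour_sum + alpha * original[word]) / (1.0 + alpha)
    return refined


# Toy data (hypothetical): "happy" and "glad" are linked, "table" is not.
vectors = {"happy": [1.0, 0.0], "glad": [0.0, 1.0], "table": [-1.0, -1.0]}
lexicon = {"happy": ["glad"], "glad": ["happy"]}
refined = retrofit(vectors, lexicon)
```

In this toy setup the two linked words move toward each other while staying near their original positions, and the unlinked word is left unchanged. Because the procedure only post-processes existing vectors, it can be applied to the output of any vector model.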

Highlights

  • Data-driven learning of word vectors that capture lexico-semantic information is a technique of central importance in NLP

  • Using FrameNet as the semantic lexicon gives weaker results, in some cases making performance worse

  • The prior models compared against use constraints among words as a regularization term on the training objective, so they can only be applied to improve the quality of Skip-Gram (SG) and CBOW vectors produced by the word2vec tool

Summary

Introduction

A variety of approaches for constructing vector space embeddings of vocabularies are in use, notably taking low-rank approximations of co-occurrence statistics (Deerwester et al., 1990) and using internal representations from neural network models of word sequences (Collobert and Weston, 2008). Because of their value as lexical semantic representations, there has been much research on improving the quality of vectors. Examples of semantic lexicons that can inform such improvements include WordNet (Miller, 1995), FrameNet (Baker et al., 1998) and the Paraphrase Database (Ganitkevitch et al., 2013).

