Abstract

We investigate the effect of various dependency-based word embeddings on distinguishing between functional and domain similarity, word similarity rankings, and two downstream tasks in English. Variations include word embeddings trained using context windows from Stanford and Universal dependencies at several levels of enhancement (ranging from unlabeled to Enhanced++ dependencies). Results are compared to basic linear contexts and evaluated on several datasets. We found that embeddings trained with Universal and Stanford dependency contexts excel at different tasks, and that enhanced dependencies often improve performance.

Highlights

  • For many natural language processing applications, it is important to understand word-level semantics

  • Word embeddings trained with neural networks have gained popularity (Mikolov et al., 2013; Pennington et al., 2014), and have been successfully used for various tasks, such as machine translation (Zou et al., 2013) and information retrieval (Hui et al., 2017)

  • Levy and Goldberg (2014) challenged the use of linear contexts, proposing instead to use contexts based on dependency parses. (This is akin to prior work that found that dependency contexts are useful for vector models (Pado and Lapata, 2007; Baroni and Lenci, 2010).) They found that embeddings trained this way are better at capturing semantic similarity, rather than relatedness

Summary

Introduction

For many natural language processing applications, it is important to understand word-level semantics. Levy and Goldberg (2014) challenged the use of linear contexts, proposing instead to use contexts based on dependency parses. (This is akin to prior work that found that dependency contexts are useful for vector models (Pado and Lapata, 2007; Baroni and Lenci, 2010).) They found that embeddings trained this way are better at capturing semantic similarity than relatedness. For example, embeddings trained using linear contexts place Hogwarts (the fictional setting of the Harry Potter series) near Dumbledore (a character from the series), whereas embeddings trained with dependency contexts place Hogwarts near Sunnydale (the fictional setting of the series Buffy the Vampire Slayer). The former is relatedness, whereas the latter is similarity.
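The difference between linear and dependency contexts can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the toy sentence, its hand-written parse, the relation labels, and the function names `linear_contexts` and `dependency_contexts` are all assumptions made for illustration. Following the general scheme of Levy and Goldberg (2014), each word takes its modifiers (with the relation label) and its head (with an inverse-relation marker) as contexts; the `labeled=False` option is a simplified stand-in for an unlabeled variant.

```python
from collections import defaultdict

# Toy sentence with a hand-written (hypothetical) dependency parse.
# Each token is (index, word, head_index, label); head_index -1 marks the root.
SENTENCE = [
    (0, "australian", 1, "amod"),
    (1, "scientist", 2, "nsubj"),
    (2, "discovers", -1, "root"),
    (3, "star", 2, "dobj"),
]


def linear_contexts(tokens, window=2):
    """Pair each word with its neighbours within `window` positions."""
    words = [t[1] for t in tokens]
    pairs = defaultdict(list)
    for i, word in enumerate(words):
        lo, hi = max(0, i - window), min(len(words), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs[word].append(words[j])
    return dict(pairs)


def dependency_contexts(tokens, labeled=True):
    """Pair each word with its modifiers and its head in the parse tree.

    With labeled=True, a modifier context is "word/label" and a head
    context is "word/label-1" (inverse relation). With labeled=False,
    only the neighbouring word is kept (an illustrative simplification).
    """
    pairs = defaultdict(list)
    for _, word, head, label in tokens:
        if head < 0:  # the root has no head
            continue
        head_word = tokens[head][1]
        # The head sees the modifier; the modifier sees the head.
        pairs[head_word].append(f"{word}/{label}" if labeled else word)
        pairs[word].append(f"{head_word}/{label}-1" if labeled else head_word)
    return dict(pairs)


if __name__ == "__main__":
    print(linear_contexts(SENTENCE, window=2)["australian"])
    print(dependency_contexts(SENTENCE)["australian"])
```

In this toy case, a linear window of 2 pairs "australian" with both "scientist" and "discovers", while the dependency contexts pair it only with the word it syntactically modifies. Feeding such (word, context) pairs to an embedding trainer is what distinguishes the embedding variants compared in the paper.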
