Abstract

Most traditional distributional similarity models fail to capture syntagmatic patterns that group together multiple word features within the same joint context. In this work we introduce a novel generic distributional similarity scheme under which the power of probabilistic models can be leveraged to effectively model joint contexts. Based on this scheme, we implement a concrete model which utilizes probabilistic n-gram language models. Our evaluations suggest that this model is particularly well-suited for measuring similarity for verbs, which are known to exhibit richer syntagmatic patterns, while maintaining comparable or better performance with respect to competitive baselines for nouns. Following this, we propose our scheme as a framework for future semantic similarity models leveraging the substantial body of work that exists in probabilistic language modeling.
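To make the idea of scoring joint contexts concrete, the following is a minimal, hypothetical sketch of how an n-gram language model can rate how well a candidate word fits an entire context window. The toy corpus, bigram order, and add-one smoothing are illustrative assumptions, not the paper's actual configuration.

```python
from collections import defaultdict

# Toy corpus; an actual model would be trained on a large corpus
# (corpus, bigram order, and smoothing here are illustrative assumptions).
corpus = ("children like cookies and milk . "
          "children enjoy cookies and milk . "
          "dogs like bones .").split()

# Unigram and bigram counts for an add-one-smoothed bigram model.
unigrams = defaultdict(int)
bigrams = defaultdict(int)
for w in corpus:
    unigrams[w] += 1
for w1, w2 in zip(corpus, corpus[1:]):
    bigrams[(w1, w2)] += 1
V = len(unigrams)  # vocabulary size, used for smoothing

def bigram_prob(w1, w2):
    """P(w2 | w1) with add-one smoothing."""
    return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + V)

def joint_context_score(window):
    """Probability of an entire word window under the language model:
    the joint context is scored as a whole, rather than each
    co-occurring word being treated as an independent feature."""
    score = 1.0
    for w1, w2 in zip(window, window[1:]):
        score *= bigram_prob(w1, w2)
    return score

# Compare how well two candidate words fit the same joint context.
context = ["children", None, "cookies", "and", "milk"]
for target in ["like", "enjoy"]:
    window = [target if w is None else w for w in context]
    print(target, joint_context_score(window))
```

A similarity measure between two words can then, for instance, be derived by comparing how the two words distribute over such jointly scored contexts.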

Highlights

  • The Distributional Hypothesis is commonly phrased as “words which are similar in meaning occur in similar contexts” (Rubenstein and Goodenough, 1965)

  • It was suggested that the word feature vector approach misses valuable information, which is embedded in the collocation and inter-relations of words within the same context (Ruiz-Casado et al., 2005)

  • Our evaluations suggest that our model is advantageous for measuring semantic similarity for verbs, while maintaining comparable or better performance with respect to competitive baselines for nouns

Introduction

The Distributional Hypothesis is commonly phrased as “words which are similar in meaning occur in similar contexts” (Rubenstein and Goodenough, 1965). It was suggested that the word feature vector approach misses valuable information, which is embedded in the collocation and inter-relations of words (e.g. word order) within the same context (Ruiz-Casado et al., 2005). Following this motivation, Ruiz-Casado et al. (2005) proposed an alternative composite-feature model, later adopted in (Agirre et al., 2009). This model adopts a richer context representation by considering entire word window contexts as features, while keeping the same computational vector-based model. For example, a single feature that could be retrieved this way for the target word like is the entire window “Children ___ cookies and milk”, with the target slot left open. They showed good results on detecting synonyms in the 80-question multiple-choice TOEFL test. We are not aware of additional work following this approach of using entire word windows as features.
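As an illustration of this composite-feature idea, here is a minimal sketch in which each feature is an entire word window around the target word. The tokenization, the window size of two, and the toy corpus are assumptions for illustration, not the exact setup of Ruiz-Casado et al. (2005).

```python
from collections import Counter
import math

def window_features(tokens, target, size=2):
    """Collect the entire word window around each occurrence of `target`
    as a single composite feature (with the target slot left open),
    rather than counting each co-occurring word separately."""
    feats = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            left = tokens[max(0, i - size):i]
            right = tokens[i + 1:i + 1 + size]
            feats[tuple(left) + ("___",) + tuple(right)] += 1
    return feats

def cosine(u, v):
    """Standard cosine similarity between two sparse count vectors."""
    dot = sum(u[f] * v[f] for f in u if f in v)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

tokens = ("kids say children like cookies and milk . "
          "kids say children enjoy cookies and milk .").split()

v_like = window_features(tokens, "like")
v_enjoy = window_features(tokens, "enjoy")
print(cosine(v_like, v_enjoy))  # 1.0: the two words share an identical window feature
```

Note that two such features match only if the windows are identical end to end, which preserves word order and collocation information but makes composite features far sparser than single-word features.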
