Abstract

Lexical entailment (LE; also known as hyponymy-hypernymy or is-a relation) is a core asymmetric lexical relation that supports tasks like taxonomy induction and text generation. In this work, we propose a simple and effective method for fine-tuning distributional word vectors for LE. Our Generalized Lexical ENtailment model (GLEN) is decoupled from the word embedding model and applicable to any distributional vector space. Yet – unlike existing retrofitting models – it captures a general specialization function allowing for LE-tuning of the entire distributional space and not only the vectors of words seen in lexical constraints. Coupled with a multilingual embedding space, GLEN seamlessly enables cross-lingual LE detection. We demonstrate the effectiveness of GLEN in graded LE and report large improvements (over 20% in accuracy) over state-of-the-art in cross-lingual LE detection.
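The abstract describes GLEN as a learned specialization function that is decoupled from the embedding model and can be applied to the entire distributional space. Below is a minimal sketch of that idea, not the authors' implementation: the network architecture, the LEAR-style asymmetric distance, the margin loss, and all names and hyperparameters are illustrative assumptions.

```python
# Minimal sketch of a GLEN-style specialization function (illustrative only;
# architecture, loss, and hyperparameters are assumptions, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpecializationFunction(nn.Module):
    """A single non-linear map applied to every word vector in the space."""

    def __init__(self, dim: int = 300, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden),
            nn.Tanh(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


def asymmetric_le_distance(x: torch.Tensor, y: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """LEAR-style asymmetric distance (assumed here): cosine distance plus a
    norm term that prefers hyponyms with smaller norms than their hypernyms."""
    cos_dist = 1.0 - F.cosine_similarity(x, y, dim=-1)
    nx, ny = x.norm(dim=-1), y.norm(dim=-1)
    return cos_dist + (nx - ny) / (nx + ny + eps)


def train_step(f, optimizer, hypo, hyper, negatives, margin: float = 1.0) -> float:
    """One margin-based update: (hyponym, hypernym) constraint pairs should
    score lower (entail more strongly) than pairs with random negatives."""
    pos = asymmetric_le_distance(f(hypo), f(hyper))
    neg = asymmetric_le_distance(f(hypo), f(negatives))
    loss = torch.clamp(margin + pos - neg, min=0.0).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    dim, n_pairs = 300, 64
    f = SpecializationFunction(dim)
    opt = torch.optim.Adam(f.parameters(), lr=1e-3)
    # Stand-ins for vectors of (hyponym, hypernym) constraint pairs and negatives.
    hypo, hyper, negs = (torch.randn(n_pairs, dim) for _ in range(3))
    print(train_step(f, opt, hypo, hyper, negs))
    # Because f is a global function, it can specialize the whole vocabulary,
    # including words never seen in the constraints.
    full_matrix = torch.randn(10000, dim)  # stand-in for all pretrained vectors
    specialized = f(full_matrix).detach()
```

Because the same function applies to every vector, coupling it with a shared multilingual embedding space would, in the same spirit, allow the mapping to be trained on constraints in one language and applied to vectors of another, which is how the abstract frames cross-lingual LE detection.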

Highlights

  • Lexical entailment (LE; the hyponymy-hypernymy or is-a relation) is a fundamental asymmetric lexico-semantic relation (Collins and Quillian, 1972; Beckwith et al., 1991) and a key building block of lexico-semantic networks and knowledge bases (Fellbaum, 1998; Navigli and Ponzetto, 2012)

  • Results in the 0% setting, in which GLEN improves over the distributional space by more than 30 points, most clearly demonstrate its effectiveness

  • GLEN’s specialization function is shaped by all constraints and has to work for all words; GLEN trades the effectiveness of LEAR’s word-specific updates on seen words for the ability to generalize to unseen words (see the sketch after these highlights)

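The trade-off in the last highlight can be seen in a toy contrast, sketched below under assumed names and update rules (not code from the paper): retrofitting-style models touch only the rows of words that occur in constraints, while a single global mapping, a linear stand-in here for GLEN's learned non-linear function, transforms every row, seen or unseen.

```python
# Illustrative contrast: retrofitting updates only constraint words,
# a global specialization function transforms all words.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["dog", "animal", "car", "zeppelin"]      # "zeppelin" never occurs in constraints
emb = rng.normal(size=(len(vocab), 4))
constraints = [("dog", "animal")]                 # (hyponym, hypernym) pairs

# Retrofitting-style update: nudge only seen words (toy update rule).
retro = emb.copy()
for hypo, hyper in constraints:
    i, j = vocab.index(hypo), vocab.index(hyper)
    retro[i] += 0.1 * (retro[j] - retro[i])       # move hyponym toward hypernym

# Global specialization: one map W applied to every row.
W = np.eye(4) + 0.05 * rng.normal(size=(4, 4))
glen_like = emb @ W.T

unseen = vocab.index("zeppelin")
print(np.allclose(retro[unseen], emb[unseen]))     # True: retrofitting leaves it unchanged
print(np.allclose(glen_like[unseen], emb[unseen])) # False: the global map specializes it too
```

This is also why the 0% setting highlighted above, where none of the test words appear in the constraints, is the scenario in which a learned global function has the clearest advantage over word-specific retrofitting.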

Summary

Introduction

Background and Motivation. Lexical entailment (LE; the hyponymy-hypernymy or is-a relation) is a fundamental asymmetric lexico-semantic relation (Collins and Quillian, 1972; Beckwith et al., 1991) and a key building block of lexico-semantic networks and knowledge bases (Fellbaum, 1998; Navigli and Ponzetto, 2012). Reasoning about word-level entailment supports a multitude of tasks such as taxonomy induction (Snow et al., 2006; Navigli et al., 2011; Gupta et al., 2017), natural language inference (Dagan et al., 2013; Bowman et al., 2015; Williams et al., 2018), metaphor detection (Mohler et al., 2013), and text generation (Biran and McKeown, 2013). Due to their distributional nature (Harris, 1954), embedding models (Mikolov et al., 2013; Levy and Goldberg, 2014; Pennington et al., 2014; Melamud et al., 2016; Bojanowski et al., 2017; Peters et al., 2018, inter alia) conflate paradigmatic relations (e.g., synonymy, antonymy, LE, meronymy) with broader topical (i.e., syntagmatic) relatedness (Schwartz et al., 2015; Mrkšić et al., 2017). Retrofitting (specialization) models address this by injecting external lexical constraints into pretrained distributional spaces. On the downside, these models specialize only the vectors of words seen in the constraints, leaving the vectors of unseen words unchanged.

