Abstract

We present LEAR (Lexical Entailment Attract-Repel), a novel post-processing method that transforms any input word vector space to emphasise the asymmetric relation of lexical entailment (LE), also known as the IS-A or hyponymy-hypernymy relation. By injecting external linguistic constraints (e.g., WordNet links) into the initial vector space, the LE specialisation procedure brings true hyponymy-hypernymy pairs closer together in the transformed Euclidean space. The proposed asymmetric distance measure adjusts the norms of word vectors to reflect the actual WordNet-style hierarchy of concepts. Simultaneously, a joint objective enforces semantic similarity using the symmetric cosine distance, yielding a vector space specialised for both lexical relations at once. LEAR specialisation achieves state-of-the-art performance in the tasks of hypernymy directionality, hypernymy detection, and graded lexical entailment, demonstrating the effectiveness and robustness of the proposed asymmetric specialisation model.
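To make the asymmetric measure concrete, the following is a minimal NumPy sketch (not the authors' released implementation; the function names and the exact way the two terms are combined are illustrative assumptions) of an LE distance that adds a norm-based directionality term to the symmetric cosine distance, so that a candidate hyponym with a smaller norm than its candidate hypernym receives a lower distance in that direction:

    import numpy as np

    def cosine_distance(x: np.ndarray, y: np.ndarray) -> float:
        """Symmetric term: 1 - cosine similarity (semantic relatedness)."""
        return 1.0 - float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

    def le_distance(x: np.ndarray, y: np.ndarray) -> float:
        """Asymmetric LE distance: low when x and y point in similar directions
        AND x (candidate hyponym) has a smaller norm than y (candidate hypernym)."""
        nx, ny = np.linalg.norm(x), np.linalg.norm(y)
        norm_term = (nx - ny) / (nx + ny)  # in [-1, 1]; negative when x is "narrower" than y
        return cosine_distance(x, y) + norm_term

Comparing le_distance(v_dog, v_animal) against le_distance(v_animal, v_dog) then yields a directionality decision for hypernymy, while the cosine term alone can still be used wherever only symmetric similarity is required.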

Highlights

  • Word representation learning has become a research area of central importance in NLP, with its usefulness demonstrated across application areas such as parsing (Chen and Manning, 2014), machine translation (Zou et al., 2013), and many others (Turian et al., 2010; Collobert et al., 2011).

  • This is often done as a post-processing step, where distributional vectors are gradually refined to satisfy linguistic constraints extracted from lexical resources such as WordNet (Faruqui et al., 2015; Mrkšić et al., 2016), the Paraphrase Database (PPDB) (Wieting et al., 2015), or BabelNet (Mrkšić et al., 2017; Vulić et al., 2017a).

  • The original paper of Kiela et al. (2015b) reports the following best scores on each task: 0.88 (BLESS), 0.75 (WBLESS), 0.57 (BIBLESS). These scores were recently surpassed by Nguyen et al. (2017), who, instead of post-processing, combine WordNet-based constraints with an SGNS-style objective into a joint model.


Summary

Introduction

Word representation learning has become a research area of central importance in NLP, with its usefulness demonstrated across application areas such as parsing (Chen and Manning, 2014), machine translation (Zou et al., 2013), and many others (Turian et al., 2010; Collobert et al., 2011). A popular solution is to go beyond stand-alone unsupervised learning and fine-tune distributional vector spaces using external knowledge from human- or automatically constructed knowledge bases. This is often done as a post-processing step, where distributional vectors are gradually refined to satisfy linguistic constraints extracted from lexical resources such as WordNet (Faruqui et al., 2015; Mrkšić et al., 2016), the Paraphrase Database (PPDB) (Wieting et al., 2015), or BabelNet (Mrkšić et al., 2017; Vulić et al., 2017a). One advantage of post-processing methods is that they treat the input vector space as a black box, making them applicable to any input space.
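As a rough illustration of this black-box idea (a simplified sketch for exposition only, not the actual LEAR or Attract-Repel objective; the update rule, parameter names, and values below are assumptions), one can iterate over externally supplied word pairs and pull their vectors together while keeping every vector close to its original distributional position:

    import numpy as np

    def post_process(vectors, attract_pairs, pull=0.3, keep=0.5, iterations=10):
        """vectors: dict word -> np.ndarray (any pre-trained space, treated as a black box).
        attract_pairs: iterable of (w1, w2) linguistic constraints, e.g. WordNet links."""
        vectors = {w: np.asarray(v, dtype=float).copy() for w, v in vectors.items()}
        original = {w: v.copy() for w, v in vectors.items()}
        for _ in range(iterations):
            # Attract step: move constrained pairs towards each other.
            for w1, w2 in attract_pairs:
                if w1 in vectors and w2 in vectors:
                    midpoint = 0.5 * (vectors[w1] + vectors[w2])
                    vectors[w1] += pull * (midpoint - vectors[w1])
                    vectors[w2] += pull * (midpoint - vectors[w2])
            # Preservation step: stay close to the original distributional vectors.
            for w in vectors:
                vectors[w] = keep * original[w] + (1.0 - keep) * vectors[w]
        return vectors

LEAR goes beyond such attract-only refinement by additionally rescaling vector norms so that the concept hierarchy is reflected in vector length, as described in the abstract above.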
