Semantic Specialization of Distributional Word Vector Spaces using Monolingual and Cross-Lingual Constraints

Nikola Mrkšić, Milica Gašić, Roi Reichart, Ivan Vulić, Diarmuid Ó Séaghdha, Anna Korhonen, Ira Leviant, Steve Young

doi:10.1162/tacl_a_00063

Abstract

We present Attract-Repel, an algorithm for improving the semantic quality of word vectors by injecting constraints extracted from lexical resources. Attract-Repel facilitates the use of constraints from mono- and cross-lingual resources, yielding semantically specialized cross-lingual vector spaces. Our evaluation shows that the method can make use of existing cross-lingual lexicons to construct high-quality vector spaces for a plethora of different languages, facilitating semantic transfer from high- to lower-resource ones. The effectiveness of our approach is demonstrated with state-of-the-art results on semantic similarity datasets in six languages. We next show that Attract-Repel-specialized vectors boost performance in the downstream task of dialogue state tracking (DST) across multiple languages. Finally, we show that cross-lingual vector spaces produced by our algorithm facilitate the training of multilingual DST models, which brings further performance improvements.

Highlights

Word representation learning has become a research area of central importance in modern natural language processing.
We introduce a new algorithm, Attract-Repel, that uses synonymy and antonymy constraints drawn from lexical resources to tune word vector spaces using linguistic information that is difficult to capture with conventional distributional training.
We investigate the extent to which semantic specialization can empower dialogue state tracking (DST) models which do not rely on such dictionaries.

Summary

Introduction

Word representation learning has become a research area of central importance in modern natural language processing. Methods that go beyond stand-alone unsupervised learning have gained increased popularity. These models typically build on distributional ones, using human- or automatically constructed knowledge bases to enrich the semantic content of existing word vector collections. Often this is done as a post-processing step, in which the distributional word vectors are refined to satisfy constraints extracted from a lexical resource such as WordNet (Faruqui et al., 2015; Wieting et al., 2015; Mrkšić et al., 2016).
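
The post-processing idea described above can be illustrated with a small sketch. It is not the Attract-Repel objective itself, which this excerpt does not specify; it is a simplified stand-in that pulls synonym pairs together and pushes antonym pairs apart on the unit sphere. The toy vectors, the word pairs, and the names `specialize`, `step`, and `epochs` are assumptions made for illustration only.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    u, v = normalize(u), normalize(v)
    return sum(a * b for a, b in zip(u, v))


def specialize(vectors, synonyms, antonyms, step=0.1, epochs=5):
    """Toy constraint-based refinement (hypothetical, not the paper's method).

    Synonym pairs take a small step toward each other, antonym pairs take a
    small step away from each other, and every updated vector is
    re-normalized to unit length.
    """
    vecs = {w: normalize(v) for w, v in vectors.items()}
    for _ in range(epochs):
        for a, b in synonyms:
            va, vb = vecs[a], vecs[b]
            vecs[a] = normalize([x + step * (y - x) for x, y in zip(va, vb)])
            vecs[b] = normalize([y + step * (x - y) for x, y in zip(va, vb)])
        for a, b in antonyms:
            va, vb = vecs[a], vecs[b]
            vecs[a] = normalize([x - step * (y - x) for x, y in zip(va, vb)])
            vecs[b] = normalize([y - step * (x - y) for x, y in zip(va, vb)])
    return vecs


# Made-up 2-D "distributional" vectors; real spaces have hundreds of dimensions.
vectors = {
    "cheap": [1.0, 0.2],
    "inexpensive": [0.6, 0.8],
    "expensive": [1.0, 0.0],
}
synonyms = [("cheap", "inexpensive")]  # pairs to pull together
antonyms = [("cheap", "expensive")]    # pairs to push apart

after = specialize(vectors, synonyms, antonyms)

for pair in synonyms + antonyms:
    before_sim = cosine(vectors[pair[0]], vectors[pair[1]])
    after_sim = cosine(after[pair[0]], after[pair[1]])
    print(pair, round(before_sim, 3), "->", round(after_sim, 3))
```

In this toy setting the synonym pair ends up closer and the antonym pair further apart than in the input vectors. Distributional training alone tends to place antonyms such as "cheap" and "expensive" near each other because they occur in similar contexts, which is the kind of information the lexical constraints are meant to correct.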

Journal: Transactions of the Association for Computational Linguistics
Publication Date: Dec 1, 2017
Citations: 228
License type: CC BY

Similar Papers

Zero-shot language extension for dialogue state tracking via pre-trained models and multi-auxiliary-tasks fine-tuning
Lu Xiang ... Chengqing Zong
Knowledge-Based Systems | VOL. 259
17 Oct 2022

Post-Specialisation: Retrofitting Vectors of Words Unseen in Lexical Resources
Ivan Vulić ... Goran Glavaš
01 Jan 2018

Dialog State Tracking for Unseen Values Using an Extended Attention Mechanism
Takami Yoshida ... Hiroshi Fujimura
01 Jan 2019

A Stack-Propagation Framework With Slot Filling for Multi-Domain Dialogue State Tracking
Yufan Wang ... Rui Fan
IEEE Transactions on Neural Networks and Learning Systems | VOL. PP
01 Jan 2024
