Abstract

Contextualized word embeddings, such as ELMo, provide meaningful representations for words and their contexts. They have been shown to have a great impact on downstream applications. However, we observe that the contextualized embeddings of a word might change drastically when its contexts are paraphrased. As these embeddings are over-sensitive to the context, the downstream model may make different predictions when the input sentence is paraphrased. To address this issue, we propose a post-processing approach to retrofit the embedding with paraphrases. Our method learns an orthogonal transformation on the input space of the contextualized word embedding model, which seeks to minimize the variance of word representations on paraphrased contexts. Experiments show that the proposed method significantly improves ELMo on various sentence classification and inference tasks.
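To make the retrofitting objective concrete, here is a minimal, self-contained PyTorch sketch of the idea, not the paper's implementation: a fixed random linear map stands in for the frozen ELMo encoder, a noisy toy batch stands in for a real group of paraphrased contexts, and the penalty weight is a placeholder.

```python
import torch

torch.manual_seed(0)
dim = 64  # toy dimension; ELMo's actual input space is larger

# Stand-in for the frozen contextual encoder. In the paper the encoder is
# ELMo itself and the learned transform acts on its input space.
frozen_encoder = torch.nn.Linear(dim, dim)
for p in frozen_encoder.parameters():
    p.requires_grad_(False)

def encode(word_inputs, W):
    """Apply the learned transform W to the input vectors, then the
    frozen encoder; one contextual vector per paraphrased context."""
    return frozen_encoder(word_inputs @ W)

def variance_loss(vecs):
    """Spread of the same word's vectors across paraphrased contexts."""
    centroid = vecs.mean(dim=0, keepdim=True)
    return ((vecs - centroid) ** 2).sum(dim=1).mean()

def orthogonality_penalty(W):
    """Soft constraint ||W^T W - I||_F^2 keeping W near-orthogonal."""
    eye = torch.eye(dim)
    return ((W.t() @ W - eye) ** 2).sum()

W = torch.nn.Parameter(torch.eye(dim))
opt = torch.optim.Adam([W], lr=1e-2)

# Toy "paraphrase group": noisy copies of one word-in-context input,
# standing in for the same word observed in paraphrased sentences.
group = torch.randn(1, dim) + 0.1 * torch.randn(8, dim)

for step in range(100):
    loss = variance_loss(encode(group, W)) + 0.1 * orthogonality_penalty(W)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Minimizing the variance term pulls a word's representations on paraphrased contexts together, while the orthogonality penalty keeps the transform norm-preserving so the geometry of the original embedding space is not collapsed.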

Highlights

  • Contextualized word embeddings have been shown to be useful for a variety of downstream tasks (Peters et al., 2018, 2017; McCann et al., 2017)

  • ELMo encodes a sentence into a 1,024-dimensional vector by averaging the representations of its top layer. We compare these baselines with four variants of paraphrase-aware retrofitting (PAR) built upon ELMo, each trained on a different paraphrase corpus

  • As in the sentence classification tasks, we apply a Multi-Layer Perceptron (MLP) with the same hyperparameters to perform the classification (illustrated in the sketch after this list)
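The pipeline in the two highlights above can be illustrated with a short sketch, assuming the allennlp 0.x `ElmoEmbedder` interface; the MLP sizes and the two-class output are illustrative placeholders, not the paper's hyperparameters.

```python
import torch
from allennlp.commands.elmo import ElmoEmbedder  # allennlp 0.x interface

elmo = ElmoEmbedder()  # loads the default pretrained ELMo weights

def sentence_vector(tokens):
    """Average the top-layer ELMo representations into one 1,024-d vector."""
    layers = elmo.embed_sentence(tokens)  # shape: (3, len(tokens), 1024)
    return layers[-1].mean(axis=0)        # top layer, mean over tokens

# A small MLP classifier over the fixed sentence vectors; hidden size and
# number of output classes here are placeholders.
mlp = torch.nn.Sequential(
    torch.nn.Linear(1024, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 2),
)

vec = torch.from_numpy(sentence_vector(["The", "movie", "was", "great", "."]))
logits = mlp(vec.float())
```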


Summary

Introduction

Contextualized word embeddings have been shown to be useful for a variety of downstream tasks (Peters et al., 2018, 2017; McCann et al., 2017). Unlike traditional word embeddings, which represent words with fixed vectors, these embedding models encode both words and their contexts and generate context-specific representations. While contextualized embeddings are useful, we observe that a language-model-based embedding model, ELMo (Peters et al., 2018), cannot accurately capture the semantic equivalence of contexts. When the contexts of a word have equivalent or similar meanings but differ in sentence structure or word order, ELMo may assign very different representations to the word.
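One way to observe this over-sensitivity is to compare a word's top-layer ELMo vectors across two paraphrased sentences. The sketch below does so, again assuming the allennlp 0.x `ElmoEmbedder` interface; the sentences are illustrative and not drawn from the paper.

```python
import numpy as np
from allennlp.commands.elmo import ElmoEmbedder  # allennlp 0.x interface

elmo = ElmoEmbedder()

def word_vector(tokens, index):
    """Top-layer ELMo vector of the token at `index`."""
    return elmo.embed_sentence(tokens)[-1][index]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The same word "bank" in two paraphrased contexts that differ only in
# word order; a similarity noticeably below 1.0 reflects the issue.
v1 = word_vector(["She", "deposited", "cash", "at", "the", "bank", "."], 5)
v2 = word_vector(["At", "the", "bank", ",", "she", "deposited", "cash", "."], 2)
print(cosine(v1, v2))
```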

