Abstract

While the attention heatmaps produced by neural machine translation (NMT) models seem insightful, there is little evidence that they reflect a model's true internal reasoning. We provide a measure of faithfulness for NMT based on a variety of stress tests in which attention weights that are crucial for a prediction are perturbed; if the learned weights are a faithful explanation of the predictions, the model should alter its predictions under these perturbations. We show that this faithfulness measure can be improved using a novel differentiable objective that rewards faithful model behaviour through probability divergence. Our experimental results on multiple language pairs show that the objective is effective in increasing faithfulness and leads to a useful analysis of NMT model behaviour and more trustworthy attention heatmaps. The proposed objective improves faithfulness without reducing translation quality; it also has a useful regularization effect on the NMT model and can even improve translation quality in some cases.
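
The abstract leaves the exact form of the objective unspecified; as a minimal illustrative sketch (not necessarily the authors' formulation), a divergence-based reward could be combined with the standard translation loss as follows, where $p_\theta(\cdot \mid y_{<t}, x)$ is the model's output distribution at decoding step $t$, $\tilde{p}_\theta$ is the distribution obtained after perturbing the attention weights crucial to that step, and $\lambda$ is a hypothetical weighting hyperparameter:

$$\mathcal{L} \;=\; -\sum_{t}\log p_\theta(y_t \mid y_{<t}, x)\;-\;\lambda \sum_{t} D_{\mathrm{KL}}\!\big(p_\theta(\cdot \mid y_{<t}, x)\,\big\Vert\,\tilde{p}_\theta(\cdot \mid y_{<t}, x)\big)$$

Minimizing such a loss both fits the training data and rewards the model when perturbing the crucial attention weights substantially changes its output distribution, i.e., when attention behaves faithfully.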

Highlights

  • How trustworthy are our neural models? This question has led to a wide variety of contemporary NLP research focusing on (a) different axes of interpretability including plausibility (Herman, 2017; Lage et al., 2019) and faithfulness (Lipton, 2018; Jacovi and Goldberg, 2020b), (b) interpretation of the neural model components (Belinkov et al., 2017; Dalvi et al., 2017; Vig and Belinkov, 2019), (c) explaining the decisions made by neural models to humans (Ribeiro et al., 2016; Li et al., 2016; Ding et al., 2017; Ghaeini et al., 2018; Bastings et al., 2019; Jain et al., 2020), and (d) evaluating different explanation methods, such as attention weights, from different perspectives

  • Our findings show that our objective is effective in increasing faithfulness and can lead to a useful analysis of neural machine translation (NMT) model behaviour and more trustworthy attention heatmaps

  • We introduce a novel learning objective based on probability divergence that rewards faithful behaviour and which can be included in the training objective for NMT


Summary

Introduction

How trustworthy are our neural models? This question has led to a wide variety of contemporary NLP research focusing on (a) different axes of interpretability including plausibility (or, interchangeably, human-interpretability) (Herman, 2017; Lage et al., 2019) and faithfulness (Lipton, 2018; Jacovi and Goldberg, 2020b), (b) interpretation of the neural model components (Belinkov et al., 2017; Dalvi et al., 2017; Vig and Belinkov, 2019), (c) explaining the decisions made by neural models to humans (using explanations, highlights, rationales, etc.) (Ribeiro et al., 2016; Li et al., 2016; Ding et al., 2017; Ghaeini et al., 2018; Bastings et al., 2019; Jain et al., 2020), and (d) evaluating different explanation methods, such as attention weights, from different perspectives.

We focus on faithfulness, which intuitively captures the extent to which an explanation accurately represents the true reasoning behind a prediction. It is important for NLP practitioners who wish to debug their neural models and improve them. Jacovi and Goldberg (2020b) emphasize distinguishing faithfulness from human-interpretability in interpretability research by providing several clarifications about the terminology used by researchers. Jacovi and Goldberg (2020a) propose criteria for how faithfulness should be evaluated. Aligned with these criteria, we study the faithfulness of NLP neural models, specifically NMT models.

We provide a faithfulness measure that is computed based on a variety of stress tests in which attention weights that are crucial for prediction are perturbed. We propose a novel differentiable objective based on probability divergence and study its effect on the discrete faithfulness measure. We expect larger, overparameterized models to become less faithful because the language model in the decoder gets better at guessing the next word on its own, which, as we shall discuss in more detail later, tends to make attention less faithful.
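
To make the idea of such a stress test concrete, the following is a minimal sketch under assumed interfaces, not the paper's exact protocol: the `decode_step` method and its `forced_attention` argument are hypothetical placeholders for however a given NMT implementation exposes per-step logits and attention. It zeroes out the most-attended source position at each decoding step and checks whether the model's prediction changes.

```python
import torch


def attention_stress_test(model, src: torch.Tensor, tgt: torch.Tensor) -> float:
    """Illustrative faithfulness stress test (a sketch): zero out the
    most-attended source position at each decoding step and count how
    often the model's next-word prediction changes.

    Assumes a single (unbatched) sentence pair and a hypothetical
    `model.decode_step(src, prefix, forced_attention=None)` API returning
    next-word logits of shape (vocab,) and attention weights of shape (src_len,).
    """
    changed, total = 0, 0
    for t in range(1, tgt.size(0)):
        prefix = tgt[:t]

        # Original prediction and the attention weights it was based on.
        logits, attn = model.decode_step(src, prefix)
        original_pred = logits.argmax(dim=-1).item()

        # Perturbation: remove the single most-attended source position
        # and renormalize the remaining weights.
        perturbed = attn.clone()
        perturbed[attn.argmax()] = 0.0
        perturbed = perturbed / perturbed.sum()

        # Re-decode with the perturbed attention forced into the decoder.
        new_logits, _ = model.decode_step(src, prefix, forced_attention=perturbed)
        changed += int(new_logits.argmax(dim=-1).item() != original_pred)
        total += 1

    # A faithful model should change its prediction when the attention it
    # claims to rely on is taken away, so higher values mean more faithful.
    return changed / max(total, 1)
```

The returned fraction can then be averaged over a test set to obtain a single score; a model whose predictions rarely change under such perturbations is relying on something other than the attention weights it reports.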

Faithfulness in NMT Models
Approach
Divergence-based Faithfulness Objective
On Attention Sparsity
Datasets
Architecture and Hyperparameters
Training Difficulties
Impact on Faithfulness
POS-tag Analysis
Effect of Training With Single Adversary on Passing Other Stress Tests
Regularization Effect
Objective
Do the New Models Have Sparser Attention?
Conclusion