Abstract

Natural Language Inference (NLI) datasets often contain hypothesis-only biases—artifacts that allow models to achieve non-trivial performance without learning whether a premise entails a hypothesis. We propose two probabilistic methods to build models that are more robust to such biases and better transfer across datasets. In contrast to standard approaches to NLI, our methods predict the probability of a premise given a hypothesis and NLI label, discouraging models from ignoring the premise. We evaluate our methods on synthetic and existing NLI datasets by training on datasets containing biases and testing on datasets containing no (or different) hypothesis-only biases. Our results indicate that these methods can make NLI models more robust to dataset-specific artifacts, transferring better than a baseline architecture in 9 out of 12 NLI datasets. Additionally, we provide an extensive analysis of the interplay of our methods with known biases in NLI datasets, as well as the effects of encouraging models to ignore biases and fine-tuning on target datasets.
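
To make the modelling direction concrete, the toy sketch below contrasts a standard discriminative NLI classifier, which predicts p(label | premise, hypothesis), with the generative direction described in the abstract, which scores p(premise | hypothesis, label) so the premise cannot be ignored. This is a minimal illustration of the idea, not the authors' architecture: the module names, the unigram-style factorization of the premise, and all dimensions are assumptions.

```python
# Hedged sketch (not the authors' code): standard discriminative NLI vs.
# scoring the premise given the hypothesis and a candidate label.
import torch
import torch.nn as nn

VOCAB, DIM, N_LABELS = 1000, 64, 3

class DiscriminativeNLI(nn.Module):
    """Standard setup: classify the label from premise and hypothesis."""
    def __init__(self):
        super().__init__()
        self.embed = nn.EmbeddingBag(VOCAB, DIM)
        self.clf = nn.Linear(2 * DIM, N_LABELS)

    def forward(self, premise, hypothesis):
        h = torch.cat([self.embed(premise), self.embed(hypothesis)], dim=-1)
        return self.clf(h)  # logits over {entailment, neutral, contradiction}

class PremiseGenerator(nn.Module):
    """Illustrative generative direction: score the premise tokens
    conditioned on the hypothesis and the label (assumed factorization)."""
    def __init__(self):
        super().__init__()
        self.embed = nn.EmbeddingBag(VOCAB, DIM)
        self.label_embed = nn.Embedding(N_LABELS, DIM)
        self.out = nn.Linear(2 * DIM, VOCAB)

    def forward(self, premise, hypothesis, label):
        ctx = torch.cat([self.embed(hypothesis), self.label_embed(label)], dim=-1)
        logp = self.out(ctx).log_softmax(-1)          # (batch, VOCAB)
        # log p(premise | hypothesis, label) under a unigram-style factorization
        return logp.gather(1, premise).sum(dim=1)

if __name__ == "__main__":
    premise = torch.randint(0, VOCAB, (2, 7))         # toy token ids
    hypothesis = torch.randint(0, VOCAB, (2, 5))
    label = torch.tensor([0, 2])
    print(DiscriminativeNLI()(premise, hypothesis).shape)  # torch.Size([2, 3])
    print(PremiseGenerator()(premise, hypothesis, label))  # log-likelihoods
```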

Highlights

  • Natural Language Inference (NLI) is often used to gauge a model’s ability to understand a relationship between two texts (Cooper et al., 1996; Dagan et al., 2006)

  • We report results of the proposed methods on the Stanford Natural Language Inference dataset (SNLI) test set

  • Method 1’s performance on SNLI does not drastically decrease even when the improvement on the target dataset is large

Summary

Introduction

Natural Language Inference (NLI) is often used to gauge a model’s ability to understand a relationship between two texts (Cooper et al., 1996; Dagan et al., 2006). In NLI, a model is tasked with determining whether a hypothesis (a woman is sleeping) would likely be inferred from a premise (a woman is talking on the phone). The development of new large-scale datasets has led to a flurry of neural network architectures for solving NLI. Recent work has found that many NLI datasets contain biases, or annotation artifacts, i.e., features present in hypotheses that enable models to perform surprisingly well using only the hypothesis, without learning the relationship between the two texts (Gururangan et al., 2018; Poliak et al., 2018b; Tsuchiya, 2018). As a ramification of such biases, models may not generalize well to other datasets that contain different or no such biases.
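
As a hedged illustration of how such hypothesis-only artifacts can be detected, the sketch below trains a classifier on hypotheses alone, with the premise withheld entirely; any accuracy above the majority-class baseline can then only come from hypothesis-only cues. The toy training examples and the scikit-learn pipeline are assumptions made for this example, not the probing setup used in the cited work.

```python
# Hedged illustration (not from the paper): a hypothesis-only classifier that
# never sees the premise. If such a model beats the majority-class baseline,
# the dataset contains the kind of annotation artifacts described above.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy hypotheses and labels, assumed for the sake of the example.
train_hypotheses = [
    "a woman is sleeping",             # contradiction-style hypothesis
    "a woman is talking",              # entailment-style hypothesis
    "a woman is talking to a friend",  # neutral-style hypothesis
]
train_labels = ["contradiction", "entailment", "neutral"]

# Bag-of-words over hypotheses only: any signal this model finds is, by
# construction, a hypothesis-only bias rather than premise-hypothesis reasoning.
hypothesis_only = make_pipeline(CountVectorizer(), LogisticRegression())
hypothesis_only.fit(train_hypotheses, train_labels)
print(hypothesis_only.predict(["nobody is sleeping"]))
```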
