Abstract

Natural language inference (NLI) is a fundamentally important task in natural language processing with many applications. The recently released Stanford Natural Language Inference (SNLI) corpus has made it possible to develop and evaluate learning-centered methods such as deep neural networks for NLI. In this paper, we propose a special long short-term memory (LSTM) architecture for NLI. Our model builds on top of a recently proposed neural attention model for NLI but is based on a significantly different idea. Instead of deriving sentence embeddings for the premise and the hypothesis to be used for classification, our solution uses a match-LSTM to perform word-by-word matching of the hypothesis with the premise. This LSTM is able to place more emphasis on important word-level matching results. In particular, we observe that it remembers important mismatches that are critical for predicting the contradiction or the neutral relationship label. On the SNLI corpus, our model achieves an accuracy of 86.1%, outperforming the state of the art.
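
To make the word-by-word matching idea concrete, below is a minimal PyTorch sketch of how a match-LSTM can attend over premise states while reading the hypothesis one word at a time. It assumes the premise and hypothesis have already been encoded into hidden-state matrices h_p and h_h by ordinary LSTMs; the class name MatchLSTMSketch, the projection layers w_p, w_h, w_m, v, and the tensor shapes are illustrative assumptions, not the authors' released implementation.

```python
# Illustrative sketch of word-by-word matching with a match-LSTM (not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MatchLSTMSketch(nn.Module):
    """Match each hypothesis word against an attention-weighted summary of the premise."""
    def __init__(self, d):
        super().__init__()
        self.w_p = nn.Linear(d, d, bias=False)  # projects premise states
        self.w_h = nn.Linear(d, d, bias=False)  # projects the current hypothesis state
        self.w_m = nn.Linear(d, d, bias=False)  # projects the previous match-LSTM state
        self.v = nn.Linear(d, 1, bias=False)    # scores each premise position
        self.cell = nn.LSTMCell(2 * d, d)       # match-LSTM over [attended premise; word]

    def forward(self, h_p, h_h):
        # h_p: (len_p, d) premise LSTM states; h_h: (len_h, d) hypothesis LSTM states
        d = h_p.size(1)
        h_m = torch.zeros(1, d)
        c_m = torch.zeros(1, d)
        for k in range(h_h.size(0)):
            # Attention over premise words, conditioned on the k-th hypothesis word
            # and on the matching performed so far (h_m).
            scores = self.v(torch.tanh(self.w_p(h_p)
                                       + self.w_h(h_h[k])
                                       + self.w_m(h_m.squeeze(0)))).squeeze(-1)
            alpha = F.softmax(scores, dim=0)        # (len_p,) attention weights
            a_k = alpha @ h_p                        # attended premise summary, (d,)
            m_k = torch.cat([a_k, h_h[k]], dim=-1)   # word-level matching input, (2d,)
            h_m, c_m = self.cell(m_k.unsqueeze(0), (h_m, c_m))
        # The final state summarizes the matching and would be fed to a 3-way classifier.
        return h_m.squeeze(0)

# Usage with random tensors standing in for encoded sentences (d = 150 is illustrative):
mlstm = MatchLSTMSketch(d=150)
out = mlstm(torch.randn(9, 150), torch.randn(7, 150))  # 9-word premise, 7-word hypothesis
```

Because the match-LSTM sees one word-level matching result per step, its gates can keep or discard individual matches, which is the mechanism the abstract refers to when it says important mismatches are remembered.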

Highlights

  • Natural language inference (NLI) is the problem of determining whether a hypothesis sentence H can be inferred from a premise sentence P (MacCartney, 2009)

  • Experiments show that our mLSTM model achieves an accuracy of 86.1% on the Stanford Natural Language Inference (SNLI) corpus, outperforming the state of the art

  • Comparing mLSTM with bidirectional LSTM (bi-LSTM) sentence modeling against the variant with standard LSTM sentence modeling, with d set to 150, shows that using a bi-LSTM to process the original sentences helps

Summary

Introduction

Natural language inference (NLI) is the problem of determining whether a hypothesis sentence H can be inferred from a premise sentence P (MacCartney, 2009). Bowman et al. (2015) released the Stanford Natural Language Inference (SNLI) corpus to encourage more learning-centered approaches to NLI. This corpus contains around 570K sentence pairs with three labels: entailment, contradiction and neutral. Bowman et al. (2015) tested a straightforward deep neural network architecture for NLI. In their architecture, the premise and the hypothesis are each represented by a sentence embedding vector. These sentence representations are built with long short-term memory (LSTM) networks; at each position k, an LSTM maintains a set of internal vectors, including an input gate i_k, a forget gate f_k, an output gate o_k and a memory cell c_k.
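
For readers unfamiliar with these vectors, the standard LSTM update at position k is sketched below in the usual textbook form; the weight matrices W, V and biases b are generic learned parameters, and the exact parameterization used in the paper may differ in detail.

```latex
% Standard LSTM update at position k (generic textbook parameterization; illustrative)
\begin{aligned}
i_k &= \sigma\!\left(W^{i} x_k + V^{i} h_{k-1} + b^{i}\right) \\
f_k &= \sigma\!\left(W^{f} x_k + V^{f} h_{k-1} + b^{f}\right) \\
o_k &= \sigma\!\left(W^{o} x_k + V^{o} h_{k-1} + b^{o}\right) \\
c_k &= f_k \odot c_{k-1} + i_k \odot \tanh\!\left(W^{c} x_k + V^{c} h_{k-1} + b^{c}\right) \\
h_k &= o_k \odot \tanh(c_k)
\end{aligned}
```

Here x_k is the k-th input word vector, h_{k-1} the previous hidden state, σ the logistic sigmoid, and ⊙ element-wise multiplication.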

