Abstract

The recent growth in the popularity and success of deep learning models on NLP classification tasks has been accompanied by the need to generate some form of natural language explanation of the predicted labels. Such generated natural language (NL) explanations are expected to be faithful, i.e., they should correlate well with the model’s internal decision making. In this work, we focus on the task of natural language inference (NLI) and address the following question: can we build NLI systems which produce labels with high accuracy, while also generating faithful explanations of their decisions? We propose Natural-language Inference over Label-specific Explanations (NILE), a novel NLI method which utilizes auto-generated label-specific NL explanations to produce labels along with faithful explanations of its decisions. We demonstrate NILE’s effectiveness over previously reported methods through automated and human evaluation of the produced labels and explanations. Our evaluation of NILE also supports the claim that accurate systems capable of providing testable explanations of their decisions can be designed. We discuss the faithfulness of NILE’s explanations in terms of the sensitivity of the decisions to the corresponding explanations. We argue that explicit evaluation of faithfulness, in addition to label and explanation accuracy, is an important step in evaluating a model’s explanations. Further, we demonstrate that task-specific probes are necessary to establish such sensitivity.
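
To make the two-stage design described above concrete, the following is a minimal, hypothetical sketch of a NILE-style pipeline: label-specific generators first produce one candidate explanation per label, and the label decision is then made by scoring those explanations, so the selected explanation is tied to the prediction by construction. The functions generate_explanation and score_explanation are illustrative stand-ins, not the paper's implementation.

    LABELS = ("entailment", "contradiction", "neutral")

    def generate_explanation(premise, hypothesis, label):
        # Hypothetical stand-in for a label-specific generator (e.g., a
        # fine-tuned language model) that produces a candidate explanation
        # arguing for `label`.
        return f"Hypothetical explanation arguing that the pair is '{label}'."

    def score_explanation(premise, hypothesis, explanation):
        # Hypothetical stand-in for the explanation processor, which scores how
        # well a candidate explanation supports a decision for this pair.
        return float(len(explanation))  # dummy score, for illustration only

    def nile_style_predict(premise, hypothesis):
        # Stage 1: generate one candidate explanation per label.
        candidates = {lbl: generate_explanation(premise, hypothesis, lbl)
                      for lbl in LABELS}
        # Stage 2: decide the label by scoring the generated explanations, so
        # the returned explanation is tied to the prediction by construction.
        scores = {lbl: score_explanation(premise, hypothesis, expl)
                  for lbl, expl in candidates.items()}
        label = max(scores, key=scores.get)
        return label, candidates[label]

    label, explanation = nile_style_predict("A man is playing a guitar on stage.",
                                            "A person is performing music.")
    print(label, "->", explanation)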

Highlights

  • Deep learning methods have been employed to improve performance on several benchmark classification tasks in NLP (Wang et al., 2018, 2019)

  • We focus on producing natural language explanations for Natural Language Inference (NLI), without sacrificing much on label accuracy

  • Through Natural-language Inference over Label-specific Explanations (NILE), we propose a framework for generating faithful natural language explanations by requiring the model to condition on generated natural language explanations
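
As a concrete illustration of why conditioning the label decision on the generated explanations matters, the snippet below sketches a simple sensitivity probe: shuffle the explanations across instances and measure how often the predicted label changes. This is an illustrative check under assumed interfaces (the predict signature and toy data are hypothetical), not the paper's exact probe.

    import random

    def sensitivity_to_explanations(predict, dataset, seed=0):
        # `predict(premise, hypothesis, explanation)` is an assumed interface to
        # a model that conditions its label on an explanation; `dataset` is a
        # list of (premise, hypothesis, explanation) triples.
        rng = random.Random(seed)
        shuffled = [expl for _, _, expl in dataset]
        rng.shuffle(shuffled)
        changed = 0
        for (premise, hypothesis, expl), other_expl in zip(dataset, shuffled):
            original = predict(premise, hypothesis, expl)
            perturbed = predict(premise, hypothesis, other_expl)
            changed += int(original != perturbed)
        # Fraction of decisions that flip when explanations are swapped; a model
        # that ignores the explanations would score near zero.
        return changed / len(dataset)

    # Toy usage with a stand-in predictor that keys only on the explanation text.
    toy_predict = lambda p, h, e: "entailment" if "because" in e else "neutral"
    toy_data = [("p1", "h1", "h1 follows because of p1"), ("p2", "h2", "unrelated")]
    print(sensitivity_to_explanations(toy_predict, toy_data))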


Summary

Introduction

Deep learning methods have been employed to improve performance on several benchmark classification tasks in NLP (Wang et al., 2018, 2019). These models aim at improving label accuracy, while it is often desirable to also produce explanations for their decisions (Lipton, 2016; Chakraborty et al., 2017). SNLI: The Stanford NLI dataset (Bowman et al., 2015) contains premise and hypothesis pairs with human-annotated labels, collected using Amazon Mechanical Turk. In the e-SNLI extension of this dataset (Camburu et al., 2018), annotators were first asked to highlight words in the premise and hypothesis which could explain the label. They were then asked to write a natural language explanation using the highlighted words.
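
As an illustration, one such annotated record could look like the following; the field names and sentences are made up for exposition and are not an actual dataset entry.

    example = {
        "premise": "A soccer player kicks the ball toward the goal.",
        "hypothesis": "Someone is playing a sport.",
        "label": "entailment",
        "highlighted_premise_words": ["soccer", "player", "kicks", "ball"],
        "highlighted_hypothesis_words": ["playing", "sport"],
        "explanation": "A soccer player kicking a ball is someone playing a sport.",
    }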

