Logical Satisfiability of Counterfactuals for Faithful Explanations in NLI

Suzanna Sia, Lambert Mathias, Amjad Almahairi, Madian Khabsa, Anton Belyy, Luke Zettlemoyer

doi:10.1609/aaai.v37i8.26174

Abstract

Evaluating the faithfulness of an explanation is desirable for many reasons, such as trust, interpretability, and diagnosing the sources of a model's errors. In this work, which focuses on the NLI task, we introduce the methodology of Faithfulness-through-Counterfactuals, which first generates a counterfactual hypothesis based on the logical predicates expressed in the explanation, and then evaluates whether the model's prediction on the counterfactual is consistent with that expressed logic (i.e., whether the new formula is logically satisfiable). In contrast to existing approaches, this does not require any explanations for training a separate verification model. We first validate the efficacy of automatic counterfactual hypothesis generation, leveraging the few-shot priming paradigm. Next, we show that our proposed metric distinguishes between human-model agreement and disagreement on new counterfactual input. In addition, we conduct a sensitivity analysis to validate that our metric is sensitive to unfaithful explanations.
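The two-step idea described in the abstract (derive a counterfactual hypothesis from the explanation, then check the model's prediction against the logic the explanation expresses) can be illustrated with a rough sketch. Everything below is an assumption made for illustration and is not taken from the paper: the `Counterfactual` record, the `is_satisfiable` and `faithfulness_score` helpers, the toy rule-based model, and the simplification that satisfiability reduces to the predicted label matching the label implied by the explanation. The paper's actual formulation and its few-shot counterfactual generation are not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

# Hypothetical record: a counterfactual hypothesis derived from an explanation,
# plus the NLI label that the explanation's logic implies for it.
@dataclass(frozen=True)
class Counterfactual:
    premise: str
    hypothesis: str
    implied_label: str  # "entailment", "neutral", or "contradiction"


def is_satisfiable(predicted_label: str, implied_label: str) -> bool:
    """Simplifying assumption: the prediction is consistent with the
    explanation's logic exactly when the two labels agree."""
    return predicted_label == implied_label


def faithfulness_score(
    model: Callable[[str, str], str],
    examples: Iterable[Counterfactual],
) -> float:
    """Fraction of counterfactuals on which the model's prediction is
    consistent with the label implied by the explanation."""
    examples = list(examples)
    if not examples:
        raise ValueError("need at least one counterfactual example")
    hits = sum(
        is_satisfiable(model(ex.premise, ex.hypothesis), ex.implied_label)
        for ex in examples
    )
    return hits / len(examples)


def toy_model(premise: str, hypothesis: str) -> str:
    """Hypothetical stand-in for an NLI model, using crude word rules."""
    p_words = set(premise.lower().split())
    h_words = set(hypothesis.lower().split())
    if h_words <= p_words:   # every hypothesis word appears in the premise
        return "entailment"
    if "not" in h_words:     # naive negation cue
        return "contradiction"
    return "neutral"


PREMISE = "a man plays a guitar on stage"
EXAMPLES = [
    Counterfactual(PREMISE, "a man plays a guitar", "entailment"),
    Counterfactual(PREMISE, "a man does not play a guitar", "contradiction"),
    Counterfactual(PREMISE, "a man plays a piano", "contradiction"),
]

if __name__ == "__main__":
    print(faithfulness_score(toy_model, EXAMPLES))
```

In this toy setup the third counterfactual is the interesting case: the explanation implies a contradiction, but the stand-in model predicts neutral, so the pair counts as inconsistent and lowers the score.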

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Jun 26, 2023
Citations: 3
