Abstract

Automated fact verification has been progressing owing to advances in modeling and the availability of large datasets. Due to the nature of the task, it is critical to understand the vulnerabilities of these systems against adversarial instances designed to make them predict incorrectly. We introduce two novel scoring metrics, attack potency and system resilience, which take into account the correctness of the adversarial instances, an aspect often ignored in adversarial evaluations. We consider six fact verification systems from the recent Fact Extraction and VERification (FEVER) challenge: the four best-scoring ones and two baselines. We evaluate adversarial instances generated by a recently proposed state-of-the-art method, a paraphrasing method, and rule-based attacks devised for fact verification. We find that our rule-based attacks have higher potency, and that while the rankings among the top systems change, they exhibit higher resilience than the baselines.

Highlights

  • Fact verification is the task of predicting whether claims can be supported or refuted by evidence

  • We explore three methods for generating adversarial instances by modifying existing dataset instances: manually crafted rule-based transformations informed by the training data; a recently proposed state-of-the-art method for automatically generating Semantically Equivalent Adversarial Rules (SEARs; Ribeiro et al., 2018), which is model-targeted; and a lexically informed paraphrasing model

  • As automated means for generating adversarial instances do not always produce grammatical or correctly labelled instances, we developed an evaluation method that incorporates instance correctness (a correctness-weighted scoring sketch follows this list)

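To illustrate the correctness-aware evaluation mentioned above, here is a minimal sketch. It assumes, for illustration only and not as the paper's exact definitions, that an attack's potency scales the error rate it induces (macro-averaged over systems) by the fraction of its adversarial instances judged correct, and that a system's resilience is its accuracy averaged over the correct instances of each attack. All function and field names below are hypothetical.

```python
# Illustrative sketch of correctness-weighted adversarial scoring.
# Assumption: potency scales the error rate an attack induces by the
# fraction of its instances judged correct (grammatical and properly
# labelled); resilience averages a system's accuracy over correct
# adversarial instances only. Names are hypothetical, not the paper's API.

def potency(instances, systems):
    """instances: list of dicts with 'label' and 'correct' (human judgement);
    systems: list of predict(instance) -> label callables."""
    correct = [x for x in instances if x["correct"]]
    if not correct:
        return 0.0
    correctness_rate = len(correct) / len(instances)
    # Macro-average error rate induced across systems, on correct instances.
    error_rates = []
    for predict in systems:
        errors = sum(predict(x) != x["label"] for x in correct)
        error_rates.append(errors / len(correct))
    return correctness_rate * sum(error_rates) / len(error_rates)


def resilience(predict, attacks):
    """attacks: list of adversarial-instance lists, one per attack method."""
    accuracies = []
    for instances in attacks:
        correct = [x for x in instances if x["correct"]]
        if not correct:
            continue
        hits = sum(predict(x) == x["label"] for x in correct)
        accuracies.append(hits / len(correct))
    return sum(accuracies) / len(accuracies) if accuracies else 0.0
```

Under these assumptions, an attack that fools every system but produces mostly ungrammatical or mislabelled claims receives low potency, while a system is only rewarded for robustness on instances that are actually valid.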
Summary

Introduction

Fact verification is the task of predicting whether claims can be supported or refuted by evidence. Advances in this task have been achieved through improved modelling and the availability of resources to train and validate systems (Wang, 2017; Baly et al., 2018b; Thorne et al., 2018). As this is a task with potentially sensitive applications such as propaganda (Baly et al., 2018a) or biased news detection (Potthast et al., 2018), it is critical to understand how systems and models behave when exposed to real-world data and how deficiencies in their training data may contribute to this.

Example adversarial SUPPORTED instance: "Bullitt is not a movie directed by Phillip D'Antoni."
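To make the rule-based transformations concrete, the following is a minimal sketch of one such rule under simple assumptions: a single negation rewrite that turns "is a" into "is not a" and flips the SUPPORTED/REFUTED label, mirroring the Bullitt example above. The rule and names are illustrative only; the paper's hand-crafted rule set is not reproduced here.

```python
# Illustrative sketch of a rule-based adversarial transformation: a single
# negation rule that rewrites the claim and flips its label. The actual
# rules in the paper are hand-crafted from the training data; this one
# rule is only an example.

LABEL_FLIP = {"SUPPORTED": "REFUTED", "REFUTED": "SUPPORTED"}


def negate_claim(claim, label):
    """Rewrite 'X is a Y' into 'X is not a Y' and flip the label.
    Returns None if the rule does not apply (e.g. NOT ENOUGH INFO claims)."""
    if label not in LABEL_FLIP or " is a " not in claim:
        return None
    new_claim = claim.replace(" is a ", " is not a ", 1)
    return new_claim, LABEL_FLIP[label]


# Example: the REFUTED claim "Bullitt is a movie directed by Phillip D'Antoni"
# becomes the SUPPORTED adversarial instance quoted above.
print(negate_claim("Bullitt is a movie directed by Phillip D'Antoni.", "REFUTED"))
# -> ("Bullitt is not a movie directed by Phillip D'Antoni.", "SUPPORTED")
```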
