Abstract
Recent pre-trained abstractive summarization systems have started to achieve credible performance, but a major barrier to their use in practice is their propensity to output summaries that are not faithful to the input and that contain factual errors. While a number of annotated datasets and statistical models for assessing factuality have been explored, there is no clear picture of what errors are most important to target or where current techniques are succeeding and failing. We explore both synthetic and human-labeled data sources for training models to identify factual errors in summarization, and study factuality at the word-, dependency-, and sentence-level. Our observations are threefold. First, exhibited factual errors differ significantly across datasets, and commonly-used training sets of simple synthetic errors do not reflect errors made on abstractive datasets like XSum. Second, human-labeled data with fine-grained annotations provides a more effective training signal than sentence-level annotations or synthetic data. Finally, we show that our best factuality detection model enables training of more factual XSum summarization models by allowing us to identify non-factual tokens in the training data.
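To make the three annotation granularities concrete, the sketch below shows one way fine-grained factuality labels for a generated summary could be represented. This is a minimal illustration, not the paper's released data format: the class name, fields, and the example sentence are assumptions made for exposition.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class FactualityAnnotation:
    """Fine-grained factuality labels for one generated summary (illustrative)."""
    summary_tokens: List[str]   # tokenized summary
    token_labels: List[int]     # word level: 1 = supported by the source, 0 = not
    # dependency level: (head token index, child token index, relation, label)
    arc_labels: List[Tuple[int, int, str, int]] = field(default_factory=list)

    @property
    def sentence_label(self) -> int:
        """Sentence level: the summary counts as factual only if every token is."""
        return int(all(self.token_labels))

# Hypothetical example: the subject entity is wrong, so its tokens are non-factual.
ann = FactualityAnnotation(
    summary_tokens=["Sajid", "Javid", "visited", "the", "factory", "."],
    token_labels=[0, 0, 1, 1, 1, 1],
    arc_labels=[(2, 1, "nsubj", 0), (2, 4, "obj", 1)],
)
print(ann.sentence_label)  # -> 0: the summary as a whole is non-factual
```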
Highlights
In this paper, we aim to answer two main questions.
Human-labeled data with fine-grained annotations provides a more effective training signal than sentence-level annotations or synthetic data.
Does commonly-used synthetic training data reflect the errors made by generation models? We find the answer is no: techniques using surface-level data corruption (Kryscinski et al., 2020; Zhao et al., 2020; Cao et al., 2020) or paraphrasing (Goyal and Durrett, 2020a) target inherently different error distributions than those seen in actual model generations, and factuality models trained on these datasets perform poorly in practice.
We show that our best factuality detection model enables training of more factual XSum summarization models by allowing us to identify non-factual tokens in the training data (see the sketch below).
We also show that different summarization domains, CNN/Daily Mail (Hermann et al., 2015; Nallapati et al., 2016) and XSum (Narayan et al., 2018), exhibit substantially different error distributions in their generated summaries.
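The last highlight can be made concrete with a small sketch: given token-level factuality predictions over the reference summaries (from some detector), the loss on tokens flagged as unsupported is masked out when fine-tuning the summarizer. This is a minimal PyTorch sketch under that assumption, not the paper's exact training recipe; the function name and masking scheme are illustrative.

```python
import torch
import torch.nn.functional as F

def masked_summarization_loss(logits, target_ids, token_factual_mask, pad_id=0):
    """Cross-entropy over the reference summary, ignoring tokens that a
    factuality detector has flagged as unsupported by the source article.

    logits:             (batch, seq_len, vocab)  decoder outputs
    target_ids:         (batch, seq_len)         reference summary token ids
    token_factual_mask: (batch, seq_len)         1 = keep token, 0 = drop from loss
    """
    per_token = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        target_ids.reshape(-1),
        ignore_index=pad_id,
        reduction="none",
    ).reshape(target_ids.shape)
    # Count only non-pad tokens that the detector considers factual.
    mask = token_factual_mask.float() * (target_ids != pad_id).float()
    return (per_token * mask).sum() / mask.sum().clamp(min=1.0)
```

In this sketch, flagged tokens simply contribute nothing to the gradient, so the summarizer is no longer rewarded for reproducing unsupported content from noisy references.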
Summary
XSum consists of British Broadcasting Corporation (BBC) articles, where the first sentence of the article is treated as a summary of the rest of the article. Models trained on this dataset have to learn to model long-range dependencies and may still be unable to recover all information in the gold summary.

We call this set of approaches entity-centric, because the transformations largely focus on perturbing entities. The approach from Kryscinski et al. (2020) has the broadest set of transformations out of this line of prior work.

In addition to sentence-level annotations, the paraphrase-based approach extracts factuality labels corresponding to each dependency arc of the generated summary. To adapt this data creation approach for our current experimental setting, we generated paraphrases of gold summaries using the paraphrase generation model of Goyal and Durrett (2020b). We generate 40k training examples for both the CNN/DM and XSum domains.
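As a concrete illustration of the entity-centric transformations described above, the sketch below corrupts a gold summary by swapping one of its named entities for a different entity of the same type from the source article, yielding a synthetic non-factual training example. It uses spaCy for entity recognition and is a simplified stand-in for the fuller transformation set of Kryscinski et al. (2020), not their actual pipeline; the example texts are invented.

```python
import random
import spacy

nlp = spacy.load("en_core_web_sm")  # small English pipeline with NER

def entity_swap(source: str, summary: str):
    """Return (summary, label): label 0 marks a synthetic factual error.

    Picks an entity in the summary and replaces it with a different entity of
    the same type found in the source article. If no valid swap exists, the
    original summary is returned with label 1 (treated as factual).
    """
    sum_ents = list(nlp(summary).ents)
    src_ents = list(nlp(source).ents)
    random.shuffle(sum_ents)
    for ent in sum_ents:
        candidates = [e.text for e in src_ents
                      if e.label_ == ent.label_ and e.text != ent.text]
        if candidates:
            corrupted = summary.replace(ent.text, random.choice(candidates), 1)
            return corrupted, 0   # non-factual synthetic example
    return summary, 1

source = ("Theresa May visited a car factory in Sunderland on Tuesday, "
          "while Jeremy Corbyn campaigned in Leeds.")
gold = "Theresa May visited a factory in Sunderland."
print(entity_swap(source, gold))
```

Because the corruption only edits surface entities, the resulting errors are easy to spot from local context, which is one reason models trained on such data transfer poorly to the errors made by real abstractive systems.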