Abstract

Abstractive summarization models generate summaries auto-regressively, but the quality of the output is often degraded by noise in the source text. Learning cross-sentence relations is a crucial step in this task, and graph-based networks are more effective at capturing the relationships between sentences. Moreover, domain knowledge is important for identifying noise in text from a specialized domain. This paper proposes a novel model structure called UGDAS, which combines a sentence-level denoiser based on an unsupervised graph network with an auto-regressive generator. It uses domain knowledge and sentence position information to denoise the original text and thereby improve the quality of the generated summaries. We evaluate the model on the text summarization task using the recently introduced CORD-19 (COVID-19 Open Research Dataset), which contains large-scale data on coronaviruses. The experimental results show that our model achieves state-of-the-art (SOTA) results on the CORD-19 dataset and outperforms the related baseline models on the PubMed Abstract dataset.
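
The abstract does not give the denoiser's scoring function, so the following is only a minimal sketch of the general idea of unsupervised, graph-based sentence denoising, not the UGDAS implementation. It assumes a TextRank-style random walk over a bag-of-words similarity graph, with a prior that favors early sentences and sentences containing domain terms. The names `denoise`, `domain_terms`, `keep`, and `damping` are hypothetical, and the kept sentences would then be passed to whatever auto-regressive generator is in use.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def denoise(sentences, domain_terms, keep=2, damping=0.85, iters=50):
    """Keep the `keep` highest-scoring sentences, in original order.

    Assumption: scores come from a personalized random walk on the
    sentence similarity graph. This is a stand-in for the paper's
    denoiser, whose details the abstract does not describe.
    """
    n = len(sentences)
    if n == 0:
        return []
    bags = [Counter(tokenize(s)) for s in sentences]

    # Edge weights: lexical similarity between every pair of sentences.
    weights = [
        [cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]

    # Prior: earlier sentences and sentences with domain terms score higher.
    prior = [1.0 / (1 + i) + sum(bag[t] for t in domain_terms)
             for i, bag in enumerate(bags)]
    total = sum(prior)
    prior = [p / total for p in prior]

    # Power iteration; sentences with no edges only receive their prior.
    scores = prior[:]
    for _ in range(iters):
        scores = [
            (1 - damping) * prior[j]
            + damping * sum(
                scores[i] * weights[i][j] / out_sum[i]
                for i in range(n) if out_sum[i] > 0
            )
            for j in range(n)
        ]

    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:keep]
    return [sentences[i] for i in sorted(top)]
```

In this sketch, a sentence that shares no vocabulary with the rest of the document and contains no domain terms ends up with a low score and is dropped before generation.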
