Abstract
Abstractive summarization models can generate summary auto-regressively, but the quality is often impacted by the noise in the text. Learning cross-sentence relations is a crucial step in this task and the graph-based network is more effective to capture the sentence relationship. Moreover, knowledge is very important to distinguish the noise of the text in special domain. A novel model structure called UGDAS is proposed in this paper, which combines a sentence-level denoiser based on an unsupervised graph-network and an auto-regressive generator. It utilizes domain knowledge and sentence position information to denoise the original text and further improve the quality of generated summaries. We use the recently-introduced dataset CORD-19 (COVID-19 Open Research Dataset) on text summarization task, which contains large-scale data on coronaviruses. The experimental results show that our model achieves the SOTA (state-of-the-art) result on CORD-19 dataset and outperforms the related baseline models on the PubMed Abstract dataset.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.