Abstract

Automatic abstractive summarization remains an open problem in natural language processing. Conventional abstractive summarization methods based on the encoder–decoder model often suffer from repetition and semantic irrelevance. Recent studies apply traditional attention or graph-based attention to the encoder–decoder model to tackle these problems, under the assumption that all sentences in the original document are indistinguishable from each other. In a document, however, the same word in different sentences is not equally important: words in a trivial sentence are less important than words in a salient sentence. Based on this observation, we develop a HITS-based attention mechanism that leverages both sentence-level and word-level information by treating the sentences and words of the original document as authorities and hubs. Building on this mechanism, we present a novel abstractive summarization method that uses Kullback–Leibler (KL) divergence to refine the attention values, and we propose a comparison mechanism in summary generation to further improve summarization performance. Experimental results on the CNN/Daily Mail and NYT datasets demonstrate improved summarization performance and show that the proposed method is comparable with other summarization methods. We also conduct experiments on CORD-19 (COVID-19 Open Research Dataset), a biomedical domain dataset, where the proposed method performs strongly compared with other state-of-the-art summarization methods.
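
To illustrate the idea of treating sentences and words as authorities and hubs, the following is a minimal sketch of the classic HITS iteration on a bipartite sentence-word graph. It is not the attention mechanism of the paper: the function name `hits_sentence_word`, the tokenized toy input, the assignment of sentences to authorities and words to hubs, and the L2 normalization are all assumptions made for illustration.

```python
import math


def hits_sentence_word(sentences, iterations=50):
    """Run plain HITS on a bipartite graph linking words to the sentences
    that contain them (hypothetical helper, for illustration only).

    sentences: list of token lists.
    Returns (sentence_scores, word_scores), each L2-normalized.
    """
    vocab = sorted({w for s in sentences for w in s})
    # Assumption: sentences play the role of authorities, words of hubs.
    hub = {w: 1.0 for w in vocab}
    auth = [1.0] * len(sentences)

    for _ in range(iterations):
        # Authority update: a sentence scores highly if it contains strong words.
        auth = [sum(hub[w] for w in set(s)) for s in sentences]
        norm = math.sqrt(sum(a * a for a in auth)) or 1.0
        auth = [a / norm for a in auth]

        # Hub update: a word scores highly if it occurs in strong sentences.
        hub = {w: 0.0 for w in vocab}
        for a, s in zip(auth, sentences):
            for w in set(s):
                hub[w] += a
        norm = math.sqrt(sum(h * h for h in hub.values())) or 1.0
        hub = {w: h / norm for w, h in hub.items()}

    return auth, hub


# Toy document: two overlapping sentences and one unrelated sentence.
doc = [
    ["covid", "vaccine", "trial"],
    ["vaccine", "trial", "results"],
    ["weather"],
]
sentence_scores, word_scores = hits_sentence_word(doc)
```

In this toy input, the two overlapping sentences reinforce each other through their shared words, while the isolated sentence and its single word receive scores close to zero. This mirrors the intuition in the abstract that words in a salient sentence matter more than words in a trivial one.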
