Abstract

Automatic document summarization is a field of natural language processing that is rapidly improving with the development of end-to-end deep learning models. In this paper, we propose a novel summarization model that combines three methods. The first is a coverage method based on noise injection, which makes the attention mechanism select only important words by treating previous context information as noise; this alleviates the tendency of summarization models to generate the same word sequence repeatedly. The second is a word association method that updates the representation of each word by comparing the information of the current step with the information of all previous decoding steps; this captures changes in the meaning of already decoded words caused by the words that follow them. The third is a suppression loss function that explicitly minimizes the probabilities of non-answer words. The proposed model showed good performance on several Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics compared with state-of-the-art models on the CNN/Daily Mail summarization task, and these results were achieved with far fewer training steps than the state-of-the-art models required.
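
The suppression loss is described only at this level of detail, so the following PyTorch sketch is just one plausible reading of the idea, not the authors' implementation: on top of the usual negative log-likelihood, it adds an explicit penalty on the probability mass the decoder assigns to non-answer tokens. The function name, tensor shapes, and the weight lam are assumptions.

    import torch
    import torch.nn.functional as F

    def suppression_loss(logits, targets, lam=1.0):
        """Negative log-likelihood plus an explicit penalty on the probability
        mass the decoder assigns to non-answer (non-target) tokens.

        logits:  (batch, steps, vocab) unnormalized decoder outputs
        targets: (batch, steps) gold summary token ids
        lam:     weight of the suppression term (illustrative hyper-parameter)
        """
        vocab = logits.size(-1)
        nll = F.cross_entropy(logits.reshape(-1, vocab), targets.reshape(-1))

        probs = F.softmax(logits, dim=-1)                             # (B, T, V)
        p_gold = probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)  # (B, T)
        non_answer_mass = (1.0 - p_gold).mean()  # mass on all non-answer tokens

        return nll + lam * non_answer_mass

Minimizing the second term pushes down the probabilities of non-answer words directly, rather than only indirectly through the log-likelihood of the gold token.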

Highlights

  • Automatic document summarization is a research field in natural language processing that extracts important information from documents [1]

  • We propose a coverage method based on noise injection, in which the noise is not a random variable but adaptive noise that changes with the context information, and coverage is defined from the context and this noise (a rough illustrative sketch follows this list)

  • To solve the problems in previous research on automatic summarization, we propose a coverage method based on noise injection, a word association method, and a suppression loss function that uses misclassification information as a penalty
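
The highlights do not give the exact formulation of the noise-based coverage, so the sketch below is only an illustrative reading under assumed names and shapes: the context vectors of earlier decoding steps are pooled into an adaptive "noise" vector, and that noise penalizes the attention scores so that source words the decoder has already relied on are down-weighted.

    import torch
    import torch.nn.functional as F

    def coverage_attention(query, enc_states, prev_contexts, w_q, w_k, w_n, v):
        """Attention whose scores are penalized by adaptive 'noise' built from
        the context vectors of previous decoding steps (illustrative only).

        query:         (batch, hid)          current decoder state
        enc_states:    (batch, src_len, hid) encoder outputs
        prev_contexts: (batch, t, hid)       context vectors of earlier steps (t >= 1)
        w_q, w_k, w_n: (hid, hid)            projection matrices
        v:             (hid,)                scoring vector
        """
        # Adaptive noise: a summary of the context the decoder has already used.
        noise = prev_contexts.mean(dim=1)                        # (B, H)

        features = (enc_states @ w_k                             # encoder term
                    + (query @ w_q).unsqueeze(1)                 # decoder term
                    - (noise @ w_n).unsqueeze(1))                # noise penalty
        scores = torch.tanh(features) @ v                        # (B, src_len)
        attn = F.softmax(scores, dim=-1)
        context = torch.einsum('bs,bsh->bh', attn, enc_states)   # (B, H)
        return attn, context

Because the noise here is derived from the running context rather than sampled randomly, it adapts to what has already been summarized, which is the property the highlight emphasizes.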

Introduction

Automatic document summarization is a research field in natural language processing that extracts important information from documents [1]. As the volume of text data grows rapidly, the importance of summarization research grows with it, because only the important information needs to be extracted. Automatic summarization can be divided into abstractive summarization and extractive summarization according to how the summary is generated. Abstractive summarization constructs a summary by generating a sequence of important words related to the input document. Extractive summarization constructs a summary by measuring the salience of the sentences or words in the input document and selecting those with the highest salience.
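
To make the extractive side of this distinction concrete (this is not the model proposed in the paper), a minimal extractive summarizer can score each sentence for salience and keep the top-scoring ones; the word-frequency salience measure below is purely illustrative.

    from collections import Counter

    def extractive_summary(sentences, k=3):
        """Toy extractive summarization: score each sentence by the average
        document frequency of its words and keep the k most salient sentences
        in their original order (illustrative salience measure only)."""
        doc_freq = Counter(w.lower() for s in sentences for w in s.split())
        salience = [
            sum(doc_freq[w.lower()] for w in s.split()) / max(len(s.split()), 1)
            for s in sentences
        ]
        top = sorted(range(len(sentences)), key=lambda i: salience[i], reverse=True)[:k]
        return [sentences[i] for i in sorted(top)]

An abstractive model, by contrast, generates new word sequences with a decoder rather than copying whole sentences from the source document.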
