Abstract

Automatic text summarization is currently a topic of great interest in many knowledge fields. Extractive multi-document text summarization methods aim to condense the textual information of a document collection by covering its main content while removing redundant information. The scientific literature offers different term-weighting schemes and similarity measures, both of which are necessary for implementing an automatic summarization system. However, to the best of the authors’ knowledge, there are no studies that analyze the performance of the different schemes and measures. In this paper, all possible combinations of the most common term-weighting schemes and similarity measures used in extractive multi-document text summarization have been implemented, compared, and analyzed. Experiments have been performed on Document Understanding Conferences (DUC) datasets, and model performance has been assessed with eight Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics and the execution time. Results show that the best term-weighting scheme is term frequency–inverse sentence frequency (TF-ISF) and the best similarity measure is cosine similarity. Moreover, the combination of the two obtained the best average results in 87.5% of the ROUGE scores among all combinations.

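To illustrate the best-performing combination reported in the abstract, the following is a minimal sketch of TF-ISF weighting with cosine similarity between sentence vectors. It assumes the common TF-ISF definition tf(t, s) · log(N / sf(t)), where N is the number of sentences and sf(t) is the number of sentences containing term t; the paper's exact formulation may differ, and the function names and example sentences are illustrative only.

```python
import math
from collections import Counter

def tf_isf_vectors(sentences):
    """Build TF-ISF (term frequency * inverse sentence frequency) vectors
    for a list of tokenized sentences. Assumes the common definition
    tf(t, s) * log(N / sf(t)); the paper's exact scheme may vary."""
    n = len(sentences)
    # sf(t): number of sentences in which each term appears
    sent_freq = Counter()
    for sent in sentences:
        sent_freq.update(set(sent))
    vectors = []
    for sent in sentences:
        tf = Counter(sent)
        vec = {term: freq * math.log(n / sent_freq[term])
               for term, freq in tf.items()}
        vectors.append(vec)
    return vectors

def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Usage example with hypothetical tokenized sentences
sents = [["automatic", "text", "summarization"],
         ["extractive", "text", "summarization", "methods"],
         ["similarity", "measures", "for", "sentences"]]
vecs = tf_isf_vectors(sents)
print(cosine_similarity(vecs[0], vecs[1]))
```

In an extractive pipeline, pairwise sentence similarities computed this way are typically used to score sentences for selection and to filter out redundant ones.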