Sentiment-oriented query-focused text summarization addressed with a multi-objective optimization approach

Jesus M Sanchez-Gomez,Miguel A Vega-Rodríguez,Carlos J Pérez

doi:10.1016/j.asoc.2021.107915


Open Access

https://doi.org/10.1016/j.asoc.2021.107915


Journal: Applied Soft Computing Journal
Publication Date: Sep 22, 2021
Citations: 7
License type: cc-by

Affiliation: University of Extremadura

Abstract

Automatic text summarization is a highly relevant task in many contexts. In particular, query-focused summarization consists of generating a summary from one or multiple documents according to a query given by the user. Sentiment analysis and opinion mining, in turn, analyze the polarity of the opinions contained in texts. This work integrates both tasks in an approach that produces an opinionated summary according to the user’s query. The resulting query-focused sentiment-oriented extractive multi-document text summarization problem entails the optimization of three criteria: query relevance, redundancy reduction, and sentiment relevance. An adaptation of the population-based crow search metaheuristic has been designed, implemented, and tested to solve this multi-objective problem. Experiments have been carried out using datasets from the Text Analysis Conference (TAC). Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics and the Pearson correlation coefficient have been used for the performance assessment. The results show that the proposed approach outperforms the existing methods in the scientific literature, with a percentage improvement of 75.5% for the ROUGE-1 score and 441.3% for the ROUGE-2 score. A Pearson correlation coefficient of +0.841 has also been obtained, indicating a strong positive linear correlation between the sentiment scores of the generated summaries and the sentiment scores of the queries of the topics.
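
The three criteria named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' formulation: the bag-of-words cosine similarity, the averaging, the `sent_score` polarity function, and the way sentiment relevance is computed are all assumptions made here for illustration only.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words term counts (lowercased, whitespace tokenization)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def objectives(summary, query, sent_score, query_sent):
    """Score a candidate extractive summary on three assumed criteria.

    summary    : non-empty list of sentences selected from the documents
    query      : the user's query text
    sent_score : hypothetical function mapping a sentence to a polarity in [-1, 1]
    query_sent : polarity of the query in [-1, 1]
    Returns (query_relevance, redundancy, sentiment_relevance);
    the first and third are to be maximized, the second minimized.
    """
    vecs = [bow(s) for s in summary]
    q = bow(query)
    # Assumed query relevance: mean similarity of each sentence to the query.
    relevance = sum(cosine(v, q) for v in vecs) / len(vecs)
    # Assumed redundancy: mean pairwise similarity between summary sentences.
    pairs = [(i, j) for i in range(len(vecs)) for j in range(i + 1, len(vecs))]
    redundancy = (
        sum(cosine(vecs[i], vecs[j]) for i, j in pairs) / len(pairs) if pairs else 0.0
    )
    # Assumed sentiment relevance: closeness of mean summary polarity to the query's.
    mean_sent = sum(sent_score(s) for s in summary) / len(summary)
    sentiment_relevance = 1.0 - abs(mean_sent - query_sent) / 2.0
    return relevance, redundancy, sentiment_relevance
```

A multi-objective optimizer would compare candidate summaries on this triple rather than on a single weighted sum, keeping the candidates that no other candidate beats on all three criteria at once.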

Highlights

Nowadays, the amount of digital information on the Internet is huge, and it keeps growing
Text Analysis Conference (TAC) is a series of annual conferences focused on applications of Natural Language Processing, providing large data collections for testing
The query-focused extractive multi-document text summarization task consists of automatically generating a summary according to a specific user information need, which is given as a query

Summary

Introduction

The amount of digital information on the Internet is huge, and it keeps growing. Internet users typically want to obtain specific information about a given topic as quickly as possible, but the large volume of existing digital information makes this task difficult. The study of users’ opinions about news, political and social events, product preferences, and marketing campaigns, among other topics, is another aspect that is currently gaining great relevance in many fields. In this respect, the area of sentiment analysis and opinion mining deals with the computational treatment of opinion and sentiment in order to analyze the polarity and the feelings shown in digital texts [2].

Methods

Results

Conclusion