Multi-document Summarization Research Articles

Sentiment analysis concerns the study of opinions expressed in a text. This paper presents the QMOS method, which employs a combination of sentiment analysis and summarization approaches. It is a lexicon-based method to query-based multi-documents summarization of opinion expressed in reviews.QMOS combines multiple sentiment dictionaries to improve word coverage limit of the individual lexicon. A major problem for a dictionary-based approach is the semantic gap between the prior polarity of a word presented by a lexicon and the word polarity in a specific context. This is due to the fact that, the polarity of a word depends on the context in which it is being used. Furthermore, the type of a sentence can also affect the performance of a sentiment analysis approach. Therefore, to tackle the aforementioned challenges, QMOS integrates multiple strategies to adjust word prior sentiment orientation while also considers the type of sentence. QMOS also employs the Semantic Sentiment Approach to determine the sentiment score of a word if it is not included in a sentiment lexicon.On the other hand, the most of the existing methods fail to distinguish the meaning of a review sentence and user's query when both of them share the similar bag-of-words; hence there is often a conflict between the extracted opinionated sentences and users’ needs. However, the summarization phase of QMOS is able to avoid extracting a review sentence whose similarity with the user's query is high but whose meaning is different. The method also employs the greedy algorithm and query expansion approach to reduce redundancy and bridge the lexical gaps for similar contexts that are expressed using different wording, respectively. Our experiment shows that the QMOS method can significantly improve the performance and make QMOS comparable to other existing methods.

Read full abstract

Abstract: Multilingual Multi-Document Summarization aims at ranking the sentences of a cluster with (at least) 2 news texts (1 in the user’s language and 1 in a foreign language), and select the top-ranked sentences for a summary in the user’s language. We explored three concept-based statistics and one superficial strategy for sentence ranking. We used a bilingual corpus (Brazilian Portuguese-English) encoded in UNL (Universal Network Language) with source and summary sentences aligned based on content overlap. Our experiment shows that “concept frequency normalized by the number of concepts in the sentence” is the measure that best ranks the sentences selected by humans. However, it does not outperform the superficial strategy based on the position of the sentences in the texts. This indicates that the most frequent concepts are not always contained in first sentences, usually selected by humans to build the summaries because they convey the main information of the collection.Keywords: content selection; concept; statistical measure; multilingual corpus; multi-document summarization.Keywords: content selection; concept; statistical measure; multilingualcorpus; multi-document summarization.Resumo: O objetivo da Sumarização Automática Multilíngue Multidocumento é ranquear as sentenças de uma coleção com ao menos duas notícias (1 na língua do usuário e 1 em língua estrangeira) e selecionar as mais bem pontuadas para compor um sumário na língua do usuário. Exploramos três estatísticas conceituais e uma estratégia superficial para criar um ranque das sentenças quanto à relevância. Para tanto, utilizamos um corpus bilíngue (português-inglês) anotado via UNL (Universal Network Language) e com textos-fonte e sumários alinhados em nível sentencial. A avaliação indica que a estatísticadenominada frequência de conceitos normalizada pelo número de conceitos da sentença é a que melhor reproduz o ranqueamento humano. Essa medida, entretanto, não supera a estratégia superficial baseada na posição das sentenças. Isso indica que os conceitos mais frequentes do cluster nem sempre estão contidos nas primeiras sentenças dos textosfonte, usualmente selecionadas pelos humanos para compor os sumários porque veiculam a informação principal da coleção.Palavras-chave: seleção de conteúdo; conceito; medida estatística; corpus multilíngue; sumarização multidocumento.

Read full abstract

Multi-document Summarization Research Articles

Related Topics

Articles published on Multi-document Summarization

Unity in Diversity: Learning Distributed Heterogeneous Sentence Representation for Extractive Summarization

Sentence Extraction Based on Sentence Distribution and Part of Speech Tagging for Multi-Document Summarization

Sentence Extraction Based on Sentence Distribution and Part of Speech Tagging for Multi-Document Summarization

Calculating the Upper Bounds for Multi-Document Summarization using Genetic Algorithms

BHLM: Bayesian theory-based hybrid learning model for multi-document summarization

Extractive multi-document summarization using multilayer networks

Multi-documents summarization based on clustering of learning object using hierarchical clustering

Empirical Analysis of Single and Multi Document Summarization using Clustering Algorithms

Multi-Document Text Summarization Using Deep Learning Algorithm with Fuzzy Logic

Multi-Document Summarization Using K-Means and Latent Dirichlet Allocation (LDA) – Significance Sentences

Extractive Summarization: Limits, Compression, Generalized Model and Heuristics

QMOS: Query-based multi-documents opinion-oriented summarization

Exploring content selection strategies for Multilingual Multi-Document Summarization based on the Universal Network Language (UNL)

Extractive multi-document text summarization using a multi-objective artificial bee colony optimization approach

Multi-document summarization based on sentence cluster using non-negative matrix factorization

Event-based knowledge reconciliation using frame embeddings and frame similarity

Personalised health document summarisation exploiting Unified Medical Language System and topic-based clustering for mobile healthcare

TOPSIS with Multiple Linear Regression for Multi-Document Text Summarization

Text summarization from legal documents: a survey

Linguistic challenges in automatic summarization technology

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Multi-document Summarization Research Articles

Related Topics

Articles published on Multi-document Summarization

Unity in Diversity: Learning Distributed Heterogeneous Sentence Representation for Extractive Summarization

Sentence Extraction Based on Sentence Distribution and Part of Speech Tagging for Multi-Document Summarization

Sentence Extraction Based on Sentence Distribution and Part of Speech Tagging for Multi-Document Summarization

Calculating the Upper Bounds for Multi-Document Summarization using Genetic Algorithms

BHLM: Bayesian theory-based hybrid learning model for multi-document summarization

Extractive multi-document summarization using multilayer networks

Multi-documents summarization based on clustering of learning object using hierarchical clustering

Empirical Analysis of Single and Multi Document Summarization using Clustering Algorithms

Multi-Document Text Summarization Using Deep Learning Algorithm with Fuzzy Logic

Multi-Document Summarization Using K-Means and Latent Dirichlet Allocation (LDA) – Significance Sentences

Extractive Summarization: Limits, Compression, Generalized Model and Heuristics

QMOS: Query-based multi-documents opinion-oriented summarization

Exploring content selection strategies for Multilingual Multi-Document Summarization based on the Universal Network Language (UNL)

Extractive multi-document text summarization using a multi-objective artificial bee colony optimization approach

Multi-document summarization based on sentence cluster using non-negative matrix factorization

Event-based knowledge reconciliation using frame embeddings and frame similarity

Personalised health document summarisation exploiting Unified Medical Language System and topic-based clustering for mobile healthcare

TOPSIS with Multiple Linear Regression for Multi-Document Text Summarization

Text summarization from legal documents: a survey

Linguistic challenges in automatic summarization technology