Multi-document Summarization Research Articles

Automatic text summarisation is the task of extracting a subset of a text that accurately represents the whole. A quality summary should contain the maximum amount of information while avoiding redundancy. Redundancy is a severe deficiency: it causes unnecessary repetition of information across sentences and should not occur in a summary. Although many optimisation-based text summarisation methods have been proposed in recent years, there is a lack of research on the simultaneous optimisation of coverage and redundancy. In this context, this study presents an approach in which maximum coverage and minimum redundancy, the two key features of a rich summary, are modelled as optimisation targets. In optimisation-based text summarisation studies, conflicting objectives are generally weighted or otherwise combined and thereby transformed into a single-objective problem. However, this transformation can directly affect the quality of the solution. In this study, the optimisation goals are pursued simultaneously, without such a transformation. To this end, the multi-objective saplings growing-up algorithm (MO-SGuA) is implemented and modified for text summarisation. The presented approach searches for Pareto-optimal solutions through simultaneous optimisation. The MO-SGuA method was tested on the open-access Document Understanding Conference (DUC) data sets. Its performance was measured using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics and compared with competing methods from the literature. Testing achieved 26.6% for the ROUGE-2 metric and 65.96% for ROUGE-L, which represents an improvement of 11.17% and 20.54%, respectively. The experimental results showed that good-quality summaries were achieved using the proposed approach.
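To make the idea of simultaneous optimisation concrete, the following is a minimal sketch of Pareto dominance over two objectives, coverage (to be maximised) and redundancy (to be minimised). It is not the MO-SGuA algorithm itself; the function names and the candidate scores are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# A candidate summary is scored as (coverage, redundancy).
# Assumption: both scores are already computed and lie in [0, 1].
Score = Tuple[float, float]


def dominates(a: Score, b: Score) -> bool:
    """Return True if a is at least as good as b on both objectives
    (higher coverage, lower redundancy) and strictly better on one."""
    cov_a, red_a = a
    cov_b, red_b = b
    no_worse = cov_a >= cov_b and red_a <= red_b
    strictly_better = cov_a > cov_b or red_a < red_b
    return no_worse and strictly_better


def pareto_front(candidates: List[Score]) -> List[Score]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Hypothetical scores for five candidate summaries.
candidates = [(0.80, 0.30), (0.70, 0.10), (0.60, 0.20), (0.80, 0.40), (0.50, 0.05)]
front = pareto_front(candidates)
print(front)
```

No weighting between the two objectives is needed here: a candidate is discarded only when another one is at least as good on both counts, which is the property a weighted single-objective formulation gives up.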

In recent times, text summarization has gained enormous attention from the research community. Among the many uses of natural language processing, text summarization has emerged as a critical component in information retrieval. In particular, within the past two decades, researchers have made many attempts to produce robust, useful summaries. Text summarization may be described as automatically constructing a condensed version of a given document while keeping the most important information it contains. It also helps users quickly grasp the fundamental ideas of information sources. The current trend in text summarization is increasingly focused on news summaries. The first work in summarization addressed single-document summarization, which generates a summary of a single document. As research advanced, mainly due to the vast quantity of information available on the internet, the concept of multidocument summarization evolved. Multidocument summarization generates summaries from a large number of source documents that are about the same subject or the same event. Because of content duplication, however, news summarization systems cope poorly with multidocument news summarization. In this work, a Naive Bayes classifier was used to distinguish news web pages from nonnews web pages based on extracted content, structure, and URL features. The Naive Bayes classifier is also compared with the SMO and J48 classifiers on the same dataset, and the findings demonstrate that it performs much better than the other two. The important content is then extracted from the correctly classified news web pages and used for keyphrase extraction from the news articles. A keyphrase can be a single word or a combination of several words representing a significant concept of the news article. Our proposed approach to keyphrase extraction is based on identifying candidate phrases in the news articles and choosing the candidate phrase with the highest weight according to a weight formula. The weight formula includes features such as TF-IDF, phrase position, and a lexical chain, constructed using WordNet, that represents the semantic relations between words. The proposed approach shows promising results compared to other existing techniques.
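The candidate-weighting step can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the actual weight formula, so the linear combination, the `alpha` parameter, the smoothed IDF, and the toy corpus are all hypothetical, candidates are restricted to single tokens, and the WordNet lexical-chain feature is left out.

```python
import math
from typing import List


def tf_idf(phrase: str, doc: List[str], corpus: List[List[str]]) -> float:
    """Term frequency in doc times a smoothed inverse document frequency."""
    tf = doc.count(phrase) / len(doc)
    df = sum(1 for d in corpus if phrase in d)
    idf = math.log((1 + len(corpus)) / (1 + df)) + 1.0
    return tf * idf


def position_score(phrase: str, doc: List[str]) -> float:
    """Higher score for phrases that first appear earlier in the document."""
    first = doc.index(phrase)
    return 1.0 - first / len(doc)


def phrase_weight(phrase: str, doc: List[str], corpus: List[List[str]],
                  alpha: float = 0.5) -> float:
    """Assumed combination: a weighted sum of TF-IDF and position."""
    return alpha * tf_idf(phrase, doc, corpus) + (1 - alpha) * position_score(phrase, doc)


# Toy, hypothetical corpus of tokenised news articles.
corpus = [
    ["election", "results", "announced", "election", "night"],
    ["weather", "forecast", "rain", "night"],
    ["team", "wins", "match"],
]
doc = corpus[0]
candidates = ["election", "results", "night"]

# Choose the candidate with the highest weight.
best = max(candidates, key=lambda p: phrase_weight(p, doc, corpus))
print(best)
```

A term that is frequent in the article, rare elsewhere, and appears early receives the highest weight; a semantic feature such as a lexical chain would enter as a further term in `phrase_weight`.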

Articles published on Multi-document Summarization

Evaluating a typology of signals for automatic detection of complementarity

VLSP 2022 Abmusu Task Dataset: A Resource for Vietnamese Abstractive Multi-Document Summarization

KEST: A graph-based keyphrase extraction technique for tweets summarization using Markov Decision Process

Two-phase Multi-document Event Summarization on Core Event Graphs

Multi-document summarization for patent documents based on generative adversarial network

What have you read? based Multi-Document Summarization

A new multi-document summarisation approach using saplings growing-up optimisation algorithms: Simultaneously optimised coverage and diversity

Web-Based News Straining and Summarization Using Machine Learning Enabled Communication Techniques for Large-Scale 5G Networks

Generating extractive sentiment summaries for natural language user queries on products

Application of K-Means Clustering Algorithm in Energy Data Analysis

Hybrid multi-document text summarization via categorization based on BERT deep learning models

Query-focused multi-document text summarization using fuzzy inference

Extractive summarization of Malayalam documents using latent Dirichlet allocation: An experience

A multi-objective memetic algorithm for query-oriented text summarization: Medicine texts as a case study

A Survey on Multi-document Summarization and Domain-Oriented Approaches

Feature-based POS tagging and sentence relevance for news multi-document summarization in Bahasa Indonesia

A developed framework for multi-document summarization using softmax regression and spider monkey optimization methods

Extractive Multi-document Text Summarization Leveraging Hybrid Semantic Similarity Measures

Feature based Entailment Recognition for Malayalam Language Texts

Template-Based Headline Generator for Multiple Documents
