Abstract

The aim of automatic multi-document abstractive summarization is to create a compressed version of the source text and preserves the salient information. Existing graph based summarization methods treat sentence as bag of words, rely on content similarity measure and did not consider semantic relationships between sentences. These methods may fail in determining redundant sentences that are semantically equivalent. This paper introduces a genetic semantic graph based approach for multi-document abstractive summarization. Semantic graph from the document set is constructed in such a way that the graph nodes represent the predicate argument structures (PASs), extracted automatically by employing semantic role labeling (SRL); and the edges of graph correspond to semantic similarity weight determined from PAS-to-PAS semantic similarity, and PAS-to-document set relationship. The PAS-to-document set relationship is represented by different features, weighted and optimized by genetic algorithm. The salient graph nodes (PASs) are ranked based on modified graph based ranking algorithm. In order to reduce redundancy, we utilize maximal marginal relevance (MMR) to re-ranks the PASs and use language generation to generate summary sentences from the top ranked PASs. Experiment of this study is carried out using DUC-2002, a standard corpus for text summarization. Experimental results reveal that the proposed approach performs better than other summarization systems.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call