Abstract
The Internet, or Web, holds a massive amount of information, and handling it is a tedious task. Summarization plays a crucial role by extracting or abstracting key content from multiple sources while preserving its meaning, thereby reducing the complexity of handling that information. Multi-document summarization gives the gist of content collected from multiple documents, and temporal summarization concentrates on temporally related events. This paper proposes a Multi-Document Temporal Summarization (MDTS) technique that generates a summary from temporally related events extracted from multiple documents. The technique extracts events together with their time stamps, using TimeML standard tags to identify events and times. These event-time pairs are stored in a structured database for easier processing. Sentence ranking methods are built on the frequency of event occurrences in each sentence. Sentence similarity measures are computed to eliminate redundant sentences from the extracted summary. Depending on the required summary length, the top-ranked sentences are selected to form the summary. Experiments are conducted on the DUC 2006 and DUC 2007 data sets, which were released for the multi-document summarization task. The extracted summaries are evaluated using ROUGE to determine the precision, recall, and F-measure of the generated summaries. The performance of the proposed method is compared with a particle swarm optimization-based algorithm (PSOS), cat swarm optimization-based summarization (CSOS), and cuckoo search-based multi-document summarization (MDSCSA). MDTS is found to perform better than these methods.
Doi: 10.28991/esj-2021-01268
Highlights
Information is overloaded with redundancy, and many types of content are spread unevenly across documents
This paper presents a novel framework for temporal summarization on multiple documents
Experiments are conducted on the Document Understanding Conference (DUC) 2006 and DUC 2007 datasets
Summary
Information is overloaded with redundancy, and many types of content are spread unevenly across documents. Summarization is the abstraction or extraction of key content from one or more information sources. The process generally identifies the vital information in a document or set of related documents and presents it in condensed form. What counts as vital information and as redundancy differs from user to user [2]. Temporal summarization involves efficiently monitoring and handling information associated with an event over time. Summarizing information from multiple sources may carry redundant data into the summary. This raises the need for a Multi-Document Temporal Summarization (MDTS) technique that generates a summary containing vital, non-redundant information.
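The selection stage outlined in the abstract (rank sentences by event frequency, drop near-duplicates, keep the top-ranked sentences) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes events have already been extracted (the TimeML tagging step is not reproduced), scores a sentence by summing the corpus-wide frequency of its events, and uses bag-of-words cosine similarity for the redundancy check. The function names `cosine` and `summarize`, the threshold of 0.5, and the sample sentences are all hypothetical.

```python
import re
from collections import Counter
from math import sqrt


def cosine(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two sentences."""
    va = Counter(re.findall(r"\w+", a.lower()))
    vb = Counter(re.findall(r"\w+", b.lower()))
    dot = sum(va[w] * vb[w] for w in va)
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def summarize(sentences, max_sentences=3, sim_threshold=0.5):
    """Pick top-ranked, non-redundant sentences.

    sentences: list of (text, events) pairs, where events is a list of
    event labels assumed to come from an upstream extraction step.
    """
    # Frequency of each event across all input sentences.
    event_freq = Counter(e for _, events in sentences for e in events)

    # Rank sentences by the total frequency of the events they mention
    # (an assumed scoring rule; sorted() is stable, so ties keep input order).
    ranked = sorted(
        sentences,
        key=lambda s: sum(event_freq[e] for e in s[1]),
        reverse=True,
    )

    summary = []
    for text, _ in ranked:
        if len(summary) == max_sentences:
            break
        # Keep a sentence only if it is not too similar to one already chosen.
        if all(cosine(text, kept) < sim_threshold for kept in summary):
            summary.append(text)
    return summary


# Hypothetical input: sentences with pre-extracted event labels.
docs = [
    ("The earthquake struck the city on Monday.", ["earthquake"]),
    ("An earthquake struck the city on Monday morning.", ["earthquake"]),
    ("Rescue teams arrived on Tuesday.", ["rescue"]),
    ("The earthquake triggered a rescue effort.", ["earthquake", "rescue"]),
]

for line in summarize(docs):
    print(line)
```

In this toy input the second sentence nearly repeats the first, so the similarity check filters it out while the sentence mentioning both events ranks highest. A real system would replace the hand-written event labels with the output of a TimeML-style tagger and would tune the threshold and summary length to the task.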