A new graph based text segmentation using Wikipedia for automatic text summarization

Mohsen Pourvali, Ph.D Mohammad

doi:10.14569/ijacsa.2012.030105

Abstract

The technology of automatic document summarization is maturing and may provide a solution to the information overload problem. Nowadays, document summarization plays an important role in information retrieval. With a large volume of documents, presenting the user with a summary of each document greatly facilitates the task of finding the desired documents. Document summarization is the process of automatically creating a compressed version of a given document that provides useful information to users, and multi-document summarization aims to produce a summary delivering the majority of the information content from a set of documents about an explicit or implicit main topic. In this paper, given an input text, we use the Wikipedia knowledge base and the words of the main text to create independent graphs. We then determine the importance of each graph and identify the sentences whose topics have high importance. Finally, we extract the sentences with high importance. Experimental results on the open benchmark datasets DUC01 and DUC02 show that our proposed approach can improve performance compared to state-of-the-art summarization approaches.
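The pipeline outlined in the abstract (build independent graphs, score the graphs, then pick sentences) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `related_pairs` argument is a hypothetical toy stand-in for relatedness derived from Wikipedia, graph importance is assumed to be the number of sentences a graph touches, and the function names are invented.

```python
import re
from collections import defaultdict


def tokenize(sentence):
    """Lowercase a sentence and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


def build_components(words, related_pairs):
    """Group words into independent graphs (connected components).

    related_pairs is a hypothetical stand-in for Wikipedia-derived
    relatedness; two words share a graph if a chain of pairs links them.
    """
    parent = {w: w for w in words}

    def find(w):
        # Union-find lookup with path halving.
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    for a, b in related_pairs:
        if a in parent and b in parent:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for w in words:
        groups[find(w)].add(w)
    # Words with no related neighbour do not form a graph of their own.
    return [g for g in groups.values() if len(g) > 1]


def rank_sentences(sentences, related_pairs):
    """Return (sentence indices by descending score, raw scores)."""
    token_sets = [tokenize(s) for s in sentences]
    vocab = set().union(*token_sets)
    components = build_components(vocab, related_pairs)
    # Assumed importance of a graph: how many sentences mention it.
    importance = [sum(1 for t in token_sets if t & comp) for comp in components]
    scores = [
        sum(imp * len(t & comp) for comp, imp in zip(components, importance))
        for t in token_sets
    ]
    order = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    return order, scores
```

Taking the first few indices of the returned order gives an extractive summary under these assumptions. A real system would replace the toy pair list with relatedness computed from Wikipedia and would likely use a more careful importance measure.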

Highlights

The technology of automatic document summarization is maturing and may provide a solution to the information overload problem
Document summarization plays an important role in information retrieval (IR)
Text summarization is the process of automatically creating a compressed version of a given text that provides useful information to users, and multi-document summarization aims to produce a summary delivering the majority of the information content from a set of documents about an explicit or implicit main topic [14]

Summary

INTRODUCTION

The technology of automatic document summarization is maturing and may provide a solution to the information overload problem. Text summarization is the process of automatically creating a compressed version of a given text that provides useful information to users, and multi-document summarization aims to produce a summary delivering the majority of the information content from a set of documents about an explicit or implicit main topic [14]. Sentence-based extractive techniques are commonly used in automatic summarization. Systems for extractive summarization are typically based on techniques for sentence extraction, and attempt to identify the set of sentences that are most important for the overall understanding of a given document. The work in [11] proposes paragraph extraction from a document based on intra-document links between paragraphs. It yields a text relationship map (TRM) from these intra-document links, which indicate that the linked texts are semantically related. Experimental results on the open benchmark datasets DUC01 and DUC02 show that our proposed approach can improve performance compared to state-of-the-art summarization approaches.
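The text relationship map mentioned above can be sketched as a small graph over paragraphs. This is only an illustration under stated assumptions, not the method of [11]: relatedness is assumed to be cosine similarity between bag-of-words vectors, the `threshold` value is arbitrary, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase alphabetic words in a paragraph."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def text_relationship_map(paragraphs, threshold=0.3):
    """Link paragraph pairs whose similarity reaches the (assumed) threshold."""
    vectors = [bag_of_words(p) for p in paragraphs]
    links = {i: set() for i in range(len(paragraphs))}
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                links[i].add(j)
                links[j].add(i)
    return links


def most_linked(links, k=1):
    """Return the k paragraph indices with the most links (ties by position)."""
    return sorted(links, key=lambda i: (-len(links[i]), i))[:k]
```

Paragraphs with many links are related to much of the document, so selecting the most linked ones is one simple way to use such a map for extraction.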

RELATED WORK

CREATE GRAPH AND TEXT SEGMENTATION

SENTENCE EXTRACTION

Datasets

Evaluation metrics

Simulation strategy and parameters

Performance evaluation and discussion

CONCLUSION


Journal: International Journal of Advanced Computer Science and Applications	Publication Date: Jan 1, 2012
Citations: 14	License type: cc-by


Similar Papers

A new sentence similarity measure and sentence based extractive technique for automatic text summarization
Ramiz M Aliguliyev
Expert Systems with Applications | VOL. 36
28 Nov 2008

Web-Based News Straining and Summarization Using Machine Learning Enabled Communication Techniques for Large-Scale 5G Networks
Amita Arora ... Manvi Siwach
Wireless Communications and Mobile Computing | VOL. 2022
23 Jun 2022

Introduction to the Special Issue on Summarization
Dragomir R Radev ... Eduard Hovy
Computational Linguistics | VOL. 28
01 Dec 2002

Survey on Graph and Cluster Based approaches in Multi-document Text Summarization
Yogesh Kumar Meena ... Ashish Jain
01 May 2014
