Text Summarization by Sentence Extraction Using Unsupervised Learning

René Arnulfo García-Hernández, Romyna Montiel, Alexander Gelbukh, Eréndira Rendón, Yulia Ledeneva, Rafael Cruz

doi:10.1007/978-3-540-88636-5_12

Abstract

The main problem in generating an extractive automatic text summary is detecting the most relevant information in the source document. Although some approaches claim to be domain- and language-independent, they rely on highly domain- or language-dependent knowledge, such as key phrases or gold-standard samples for machine-learning approaches. In this work, we propose a language- and domain-independent automatic text summarization approach by sentence extraction using an unsupervised learning algorithm. Our hypothesis is that an unsupervised algorithm can help cluster similar ideas (sentences). Then, to compose the summary, the most representative sentence is selected from each cluster. Several experiments on the standard DUC-2002 collection show that the proposed method obtains more favorable results than other approaches.
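The two steps described in the abstract (cluster similar sentences, then pick the most representative sentence of each cluster) can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, the sentence representation, or the similarity measure, so everything below is an assumption made for illustration only: k-means-style clustering, raw term-frequency vectors, cosine similarity, and seeding with the first k sentences. The function names (`tokenize`, `cosine`, `centroid`, `summarize`) are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    # Lowercase alphanumeric tokens; no stop-word list or stemmer is used,
    # which keeps the sketch free of language-specific resources.
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine(a, b):
    # Cosine similarity between two sparse term-weight mappings.
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors):
    # Mean term-frequency vector of a non-empty cluster.
    total = Counter()
    for v in vectors:
        total.update(v)
    return {t: c / len(vectors) for t, c in total.items()}


def summarize(sentences, k, iterations=20):
    # Assumption: k-means-style clustering; the abstract only says
    # "an unsupervised learning algorithm".
    k = min(k, len(sentences))
    if k <= 0:
        return []
    vectors = [Counter(tokenize(s)) for s in sentences]
    # Assumption: seed centroids with the first k sentences (deterministic).
    centroids = [dict(v) for v in vectors[:k]]
    assignment = None
    for _ in range(iterations):
        new_assignment = [
            max(range(k), key=lambda c: cosine(v, centroids[c]))
            for v in vectors
        ]
        if new_assignment == assignment:
            break  # assignments are stable, so the clustering has converged
        assignment = new_assignment
        for c in range(k):
            members = [vectors[i] for i, a in enumerate(assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = centroid(members)
    # From each non-empty cluster, keep the sentence closest to its centroid.
    chosen = []
    for c in range(k):
        members = [i for i, a in enumerate(assignment) if a == c]
        if members:
            chosen.append(
                max(members, key=lambda i: cosine(vectors[i], centroids[c]))
            )
    # Emit the selected sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize(sentences, k)` with a list of sentence strings returns at most `k` sentences in document order. The result depends on the seeding and on the choice of `k`, both of which are assumptions of this sketch rather than settings reported in the abstract.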

Full Text