Ikhtasir: A user selected compression ratio Arabic text summarization system

Aqil Azmi, Suha Al-Thanyyan

doi:10.1109/nlpke.2009.5313732

Abstract

Automatic text summarization is an active research field. The rapid growth of the Web, and the information overload that comes with it, has injected new life into this research area. Some languages have been studied extensively in automatic text summarization; Arabic is not one of them. In this paper we present an automatic extractive Arabic text summarization system in which the user can cap the size of the summary. The system does not require learning. It employs rhetorical structure theory (RST) together with a sentence scoring scheme that assigns a score to each individual sentence. For output, sentences are selected so as to maximize the overall score of the summary while keeping its size within the user-selected compression ratio. For evaluation, system-generated summaries of various lengths were compared against summaries produced by a human professional. Experiments on sample texts show that our system outperforms some existing systems, including some that require learning.
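
The selection step described in the abstract (maximize the total sentence score subject to a size cap) can be read as a 0/1 knapsack problem. The sketch below is one possible illustration under that reading, not the authors' implementation. It assumes that summary size is measured in whitespace-separated words, that the compression ratio is a fraction of the source word count, and that sentence scores are already available. The function name `select_sentences` and its parameters are hypothetical, and the RST analysis is not modeled here.

```python
def select_sentences(sentences, scores, ratio):
    """Pick the subset of sentences with the highest total score whose
    combined word count stays within ratio * (source word count).

    Assumption: size is counted in whitespace-separated words; the
    abstract does not say how the system measures summary size.
    """
    sizes = [len(s.split()) for s in sentences]
    budget = int(ratio * sum(sizes))

    # best[w] holds (total score, chosen indices) for the best subset
    # that uses at most w words.
    best = [(0.0, ())] * (budget + 1)
    for i, (size, score) in enumerate(zip(sizes, scores)):
        # Walk the budget downward so each sentence is used at most once.
        for w in range(budget, size - 1, -1):
            candidate = best[w - size][0] + score
            if candidate > best[w][0]:
                best[w] = (candidate, best[w - size][1] + (i,))

    total, chosen = best[budget]
    # Indices are appended in increasing order, so the summary keeps
    # the original sentence order.
    return [sentences[i] for i in chosen], total


if __name__ == "__main__":
    # Toy data: placeholder sentences and made-up scores.
    toy_sentences = ["a b c d", "e f", "g h", "i j"]
    toy_scores = [5.0, 3.0, 3.0, 1.0]
    summary, total = select_sentences(toy_sentences, toy_scores, 0.5)
    print(summary, total)
```

In the toy run the budget is 5 words, so the two short sentences scoring 3.0 each are preferred over the single 4-word sentence scoring 5.0. A greedy pick by highest score would have chosen the latter, which is why an exact dynamic program is used in this sketch.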

Full Text