Abstract

Text summarisation is a useful tool for quickly exploiting the huge amount of textual documents available online. Several approaches have been proposed to date for producing extractive summaries in Arabic. In most cases, however, the linguistic quality of the generated summaries is unsatisfactory. In this paper, we attempt to overcome this limitation by proposing a new approach to single-document summarisation that combines discourse analysis within the rhetorical structure theory (RST) framework with a score-based method. Unlike traditional RST-based approaches, the proposed approach exploits intra-sentence discourse relations, rather than the discourse structure of the whole text, to produce a primary summary. Each sentence in the primary summary is then scored using a combination of statistical and linguistic features to produce the final summary at the compression rate specified by the user. The proposed approach was evaluated on the Essex Arabic Summaries Corpus (EASC) using the ROUGE-1 and ROUGE-2 measures and compared against existing methods. A human evaluation was also conducted to assess the linguistic quality of the generated summaries. The experimental results are encouraging and indicate that exploiting discourse relations is useful for producing Arabic extractive summaries with good linguistic quality.
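
For readers unfamiliar with the ROUGE-1 and ROUGE-2 measures mentioned above, the following is a minimal sketch of ROUGE-N recall, that is, the fraction of reference n-grams that also appear in the candidate summary, with counts clipped. It is not the evaluation code used in the paper: the function names are hypothetical, and whitespace tokenisation is an assumption made for brevity (Arabic text would normally need proper tokenisation and normalisation first).

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall of a candidate summary against one reference summary.

    Assumption: both texts are tokenised by whitespace.
    """
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Clip each match to the number of times the n-gram occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
print(rouge_n_recall(candidate, reference, n=1))  # unigram recall (ROUGE-1)
print(rouge_n_recall(candidate, reference, n=2))  # bigram recall (ROUGE-2)
```

In this toy pair, five of the six reference unigrams and three of the five reference bigrams are matched, which shows why ROUGE-2 is usually the stricter of the two measures.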
