Abstract

We present a technique for the automatic summarization of Arabic texts, based on Rhetorical Structure Theory (RST) (Mann, 1988) and on linguistic knowledge. We first describe a corpus study that allowed us to specify, from empirical observations, a set of rhetorical relations and rhetorical frames. We then present our method for automatically summarizing Arabic texts and, finally, the architecture of the ARSTResume system. The method relies on three pillars. The first consists in locating the rhetorical relations between the minimal units of the text by applying rhetorical rules. In each relation, one unit is the nucleus (the segment necessary to maintain coherence) and the other is either a nucleus or a satellite (an optional segment). The second pillar is the representation of the input text as an RST tree, in hierarchical form, and the simplification of that tree. The third pillar is the selection of sentences for the final summary, which takes into account the types of rhetorical relations chosen for the extract.
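To make the third pillar concrete, the following is a minimal sketch of relation-driven selection, not the ARSTResume implementation. It assumes the relations between units have already been identified, and it uses a deliberately simple policy: every nucleus is kept, and a satellite is kept only if its relation type is among those chosen for the extract. The `Relation` class, the `select_units` function, and the relation labels are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """A rhetorical relation between two minimal units (hypothetical structure)."""
    name: str                       # relation label, e.g. "condition" (illustrative)
    nucleus: int                    # index of the nucleus unit
    other: int                      # index of the second unit
    other_is_nucleus: bool = False  # True when both units are nuclei


def select_units(n_units, relations, kept_relations):
    """Return the indices of the units retained for the extract.

    Assumed policy: a satellite is dropped when its relation type is not
    in `kept_relations`; nuclei are never dropped.
    """
    dropped = set()
    for rel in relations:
        if not rel.other_is_nucleus and rel.name not in kept_relations:
            dropped.add(rel.other)
    return [i for i in range(n_units) if i not in dropped]


# Four placeholder units and three illustrative relations.
units = ["unit 0", "unit 1", "unit 2", "unit 3"]
relations = [
    Relation("explanation", nucleus=0, other=1),
    Relation("condition", nucleus=2, other=3),
    Relation("joint", nucleus=0, other=2, other_is_nucleus=True),
]
summary = [units[i] for i in select_units(len(units), relations, {"condition"})]
print(summary)
```

In this sketch, choosing a different set of relation types changes which satellites survive, which is one way to read the abstract's statement that selection depends on the relations chosen for the extract.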
