Abstract

Problem Statement: Processing texts on the basis of Rhetorical Structure Theory (RST) has shown interesting results. RST improves the ability to extract the semantics behind the processed text, and applications such as information retrieval, text summarization, and text generation have been shown to give better results when they use it. The applicability of RST to processing and understanding texts has been studied in several languages, but little work has been devoted to Arabic. Given an Arabic text, the more accurately the Arabic rhetorical relations are extracted, the more useful the subsequent text representation will be. This, in turn, leads to a better understanding of the text and, hence, better results. Approach: We present a framework for applying RST to the Arabic language in order to rhetorically parse, understand, and summarize Arabic texts. We discuss a new approach to extracting Arabic rhetorical relations that is based on studying the English relations, analyzing an Arabic corpus, and understanding and using Arabic cue phrases. Results: We obtain rhetorical relations based on Arabic cues and show how this approach contributes to improving the understanding of Arabic text. The study addresses the relations that arise from cues that act as connectors among Arabic clauses as well as words. Conclusion: The introduced approach suggests that taking text coherence into account when obtaining Arabic rhetorical relations suits the characteristics of the Arabic language and avoids the disadvantages of previous approaches. The obtained Arabic rhetorical relations will make it possible to build rhetorical trees for Arabic texts, for use in text summarization and generation, information retrieval, and text segmentation, while preserving the coherence of the text.

Highlights

  • Rhetorical Structure Theory (RST)[11] was introduced to serve as a discourse structure in the field of computational linguistics

  • During rhetorical analysis, the elementary units that participate in building the rhetorical schema are identified, and the rhetorical relations that hold among these units are determined in order to connect related spans

  • We showed a framework for applying RST to the Arabic language in order to rhetorically parse and understand Arabic texts


Summary

Introduction

Rhetorical Structure Theory (RST)[11] was introduced to serve as a discourse structure in the field of computational linguistics. Rhetorical relations can be described functionally in terms of the writer's purposes and the writer's assumptions about the reader. These rhetorical relations hold between adjacent and non-adjacent spans of text. The output of applying RST to a text is a tree structure that organizes the text based on the rhetorical relations[14]. This structure is called the rhetorical schema. The potential relations that connect related spans can be determined using several techniques[2,5,6]. One such technique is the use of cue phrases[16]. Cue phrases have been used in various applications, including text segmentation[9,10,18] and text summarization[7].
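
The idea of using cue phrases to propose relations between spans can be illustrated with a minimal Python sketch. It is not the method of the paper: the cue table, the relation labels, and the names `CUE_RELATIONS`, `Relation`, and `find_relations` are all assumptions made for illustration, and matching on whitespace-separated tokens ignores the fact that Arabic connectors often attach to neighbouring words as clitics.

```python
from dataclasses import dataclass

# Illustrative cue table only. The entries and relation labels are assumptions
# for this sketch, not the inventory derived in the study.
CUE_RELATIONS = {
    "لأن": "Cause",      # roughly "because"
    "لكن": "Contrast",   # roughly "but"
    "ثم": "Sequence",    # roughly "then"
}


@dataclass
class Relation:
    name: str   # hypothesized rhetorical relation
    cue: str    # cue phrase that signalled it
    left: str   # span before the cue
    right: str  # span after the cue


def find_relations(text: str) -> list[Relation]:
    """Propose one relation for each known cue found between two non-empty spans."""
    tokens = text.split()
    relations = []
    for i, token in enumerate(tokens):
        # A cue at the very start or end has no span on one side, so skip it.
        if token in CUE_RELATIONS and 0 < i < len(tokens) - 1:
            relations.append(
                Relation(
                    name=CUE_RELATIONS[token],
                    cue=token,
                    left=" ".join(tokens[:i]),
                    right=" ".join(tokens[i + 1:]),
                )
            )
    return relations


if __name__ == "__main__":
    for rel in find_relations("تأخر الطالب لأن الطريق مزدحم"):
        print(rel.name, "|", rel.left, "|", rel.right)
```

A lookup like this only proposes candidate relations. Building the rhetorical schema would still require choosing among candidates and assembling the spans into a tree, which this sketch does not attempt.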
