Stylometric analysis of classical Arabic texts for genre detection

Maha Al-Yahya

doi:10.1108/el-11-2017-0236

Abstract

Purpose: In the context of information retrieval, the genre of a text is as important as its content, and knowledge of the genre enhances search engine features by enabling customized retrieval. The purpose of this study is to explore and evaluate the use of stylometric analysis, a quantitative analysis of the linguistic features of text, to support the task of automated text genre detection for Classical Arabic text.

Design/methodology/approach: Unsupervised clustering and supervised classification were applied to the King Saud University Corpus of Classical Arabic texts (KSUCCA) using the most frequent words in the corpus (MFWs) as stylometric features. Four popular distance measures established in stylometric research are evaluated for the genre detection task.

Findings: The results of the experiments show that stylometry-based genre clustering and classification align well with human-defined genres. The evidence suggests that genre style signals exist for Classical Arabic and can be used to support the task of automated genre detection.

Originality/value: This work targets the task of genre detection in Classical Arabic text using stylometric features, an approach that has previously been applied only to Arabic authorship attribution. The study also compares, using clustering and classification, four distance measures used in stylometric analysis on the KSUCCA, a corpus of over 50 million words of Classical Arabic.
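The feature-and-distance idea described above can be illustrated with a minimal sketch. It is not the study's implementation: the abstract does not name the four distance measures, so Burrows's Delta (the mean absolute difference of z-scored MFW frequencies) is used here as an assumed example of a measure common in stylometric research. The function names and the pre-tokenised input format are hypothetical, and loading KSUCCA is not shown.

```python
from collections import Counter
import math


def mfw_profile(docs, n_mfw):
    """Build relative-frequency vectors over the corpus-wide most frequent words.

    docs  : list of token lists (tokenisation is assumed to be done elsewhere)
    n_mfw : number of most frequent words (MFWs) to use as features
    """
    corpus_counts = Counter()
    for tokens in docs:
        corpus_counts.update(tokens)
    mfws = [word for word, _ in corpus_counts.most_common(n_mfw)]

    vectors = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for an empty text
        vectors.append([counts[word] / total for word in mfws])
    return mfws, vectors


def zscores(vectors):
    """Standardise each MFW feature across all texts (population std)."""
    n = len(vectors)
    dims = len(vectors[0])
    means = [sum(v[j] for v in vectors) / n for j in range(dims)]
    stds = [
        math.sqrt(sum((v[j] - means[j]) ** 2 for v in vectors) / n)
        for j in range(dims)
    ]
    return [
        [(v[j] - means[j]) / stds[j] if stds[j] > 0 else 0.0 for j in range(dims)]
        for v in vectors
    ]


def burrows_delta(a, b):
    """Burrows's Delta: mean absolute difference between two z-score vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)
```

A pairwise matrix of such distances could then feed a clustering routine or a nearest-neighbour classifier whose labels are the human-defined genres; which algorithms the study actually used is not stated in the abstract.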

Journal: The Electronic Library
Publication Date: Oct 4, 2018
Citations: 10

Similar Papers

Genre and author detection in Turkish texts using artificial immune recognition systems
Zafer Kaban ... Banu Diri
01 Apr 2008

Towards Automated Fiqh School Authorship Attribution
Maha Al-Yahya
01 Jan 2023

Distance measures in author profiling
Mirco Kocher ... Jacques Savoy
Information Processing & Management | VOL. 53
04 May 2017

Influence of features discretization on accuracy of random forest classifier for web user identification
Alisa A Vorobeva
01 Apr 2017
