A Comparative Survey of Authorship Attribution on Short Arabic Texts

Siham Ouamour, Halim Sayoud

doi:10.1007/978-3-319-99579-3_50

Abstract

In this paper, we address the problem of authorship attribution (AA) on short Arabic texts. We survey a set of features and classifiers that are employed for the task of AA. The features investigated are characters, character bigrams, character trigrams, character tetragrams, words, word bigrams and rare words. Attribution is performed with 4 different measures, 3 classifiers (Multi-Layer Perceptron (MLP), Support Vector Machines (SVM) and Linear Regression (LR)) and a newly proposed fusion technique called VBF (Vote Based Fusion). The evaluation is carried out on short Arabic texts extracted from the AAAT dataset (AA of Ancient Arabic Texts). Although AA is known to be difficult on short texts, the results reveal interesting information about the performance of the features and classification techniques on Arabic text data. For instance, character-based features appear to perform better than word-based features on short texts. Furthermore, the proposed VBF fusion achieved an attribution accuracy of 90%, which is higher than the score of the original classifier using only one feature. Overall, the results of this investigation shed light on the efficiency and pertinence of several features and classifiers for AA of short Arabic texts.
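As a rough illustration of two ideas named in the abstract, the sketch below extracts character n-grams from a text and combines several per-feature predictions by voting. It is not the authors' implementation: the abstract does not say how VBF weighs or combines votes, so a plain majority vote is assumed here, and the function names `char_ngrams` and `majority_vote` are hypothetical.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count the overlapping character n-grams of length n in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def majority_vote(predictions):
    """Return the most frequently predicted author label.

    Assumption: a simple unweighted majority vote. Ties are resolved in
    favour of the label that appears first in the list.
    """
    counts = Counter(predictions)
    return counts.most_common(1)[0][0]


# Character bigrams of a short Arabic word (4 characters, so 3 bigrams).
bigrams = char_ngrams("كتاب", 2)

# Hypothetical predictions from classifiers trained on different features.
fused = majority_vote(["author_A", "author_B", "author_A"])
```

Character n-grams stay informative even when a text contains few words, which is consistent with the abstract's observation that character-based features appear to do better than word-based ones on short texts.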

Similar Papers

Authorship Attribution of Short Historical Arabic Texts Based on Lexical Features
S Ouamour ... H Sayoud
01 Oct 2013

Authorship Attribution of Short Historical Arabic Texts using Stylometric Features and a KNN Classifier with Limited Training Data
Fatma Howedi ... Zahra Aborawi Aborawi
Journal of Computer Science | VOL. 16
01 Oct 2020

Authorship attribution of ancient texts written by ten Arabic travelers using character N-Grams
Siham Ouamour ... Halim Sayoud
01 May 2013

Leveraging Knowledge-Based Features With Multilevel Attention Mechanisms for Short Arabic Text Classification
Iyad Alagha
IEEE Access | VOL. 10
01 Jan 2021
