Comparative Analysis and Evaluation of Stemming and Preprocessing Techniques for Arabic Text

Samer Mohammed Yaseen, Abdualmajed A G Al-Khulaidi

doi:10.59628/jast.v1i4.588

Abstract

Arabic information retrieval is challenging because of the language's complex morphology and syntax. Preprocessing and stemming improve the accuracy and efficiency of Arabic information retrieval. This paper provides a comprehensive analysis of the existing literature on Arabic preprocessing and stemming techniques and identifies their limitations and challenges. It emphasizes the importance of preprocessing and stemming and underscores the need for further research to improve Arabic information retrieval. The study also evaluates ten stemmers on a public dataset. The results show that the root-based stemmers Lucene and Khoja achieved the highest reduction rates, 90.9% and 85% respectively. The results indicate that root-based stemmers conflate similar terms well, while light stemmers under-stem them.
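The abstract does not define "reduction rate". As a minimal sketch, the Python below assumes it means the relative drop in the number of distinct terms after stemming, which is one common way to compare stemmers. The word list, `toy_light_stem`, `toy_root_stem`, and the `ROOTS` table are hypothetical illustrations, not the stemmers or the dataset evaluated in the paper.

```python
def reduction_rate(words, stem):
    """Percentage drop in vocabulary size after stemming (assumed definition)."""
    vocab = set(words)
    if not vocab:
        raise ValueError("words must not be empty")
    stems = {stem(w) for w in vocab}
    return 100.0 * (len(vocab) - len(stems)) / len(vocab)


def toy_light_stem(word):
    # Hypothetical light stemmer: strips the definite article and one common suffix.
    if word.startswith("ال") and len(word) > 4:
        word = word[2:]
    for suffix in ("ات", "ون", "ين", "ة"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Hypothetical lookup standing in for root extraction (all forms share the root k-t-b).
ROOTS = {"كتاب": "كتب", "مكتب": "كتب", "كاتب": "كتب"}


def toy_root_stem(word):
    # Hypothetical root-based stemmer: light-stem first, then map to a root if known.
    light = toy_light_stem(word)
    return ROOTS.get(light, light)


WORDS = ["كتاب", "الكتاب", "كتابة", "الكتابة", "مكتبة", "المكتبة", "كاتب", "كاتبون"]

light_rate = reduction_rate(WORDS, toy_light_stem)  # 8 words -> 3 stems
root_rate = reduction_rate(WORDS, toy_root_stem)    # 8 words -> 1 root
print(f"light: {light_rate:.1f}%  root: {root_rate:.1f}%")
```

On this toy list the light stemmer leaves three distinct stems, while the root lookup collapses all eight words into one. That is the kind of gap in conflation the abstract describes between root-based and light stemmers.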

Journal: Sana'a University Journal of Applied Sciences and Technology (مجلة جامعة صنعاء للعلوم التطبيقية والتكنولوجيا)
Publication Date: Dec 21, 2023
License type: CC BY-NC-ND 4.0

