Comparative Analysis of Arabic Stemming Algorithms

Mohammed A Otair

doi:10.5121/ijmit.2013.5201

Abstract

Arabic stemming algorithms have become a major research area in information retrieval. Many researchers have developed algorithms to solve the stemming problem, but each has proposed their own methodology and measurements for testing performance and computing accuracy. As a result, these algorithms cannot be compared accurately. This paper theoretically analyzes many generic conflation techniques and stemming algorithms, and then summarizes the main characteristics of the Arabic language that must be understood before discussing Arabic stemmers. The evaluation of the algorithms shows that Arabic stemming remains one of the major challenges in information retrieval. The paper compares most of the commonly used light stemmers in terms of their affix lists, algorithms, main ideas, and information retrieval performance. The results show that the light10 stemmer outperformed the other stemmers. Finally, recommendations are presented for future research on developing a standard Arabic stemmer.
