Arabic Text Classification: A Review

Adel Hamdan Mohammad

doi:10.5539/mas.v13n5p88

Abstract

Text classification is an important topic. The number of electronic documents available online is massive. Text classification aims to assign documents to a set of predefined categories. Far more research has been conducted on English datasets than on Arabic datasets. This work can serve as a reference for researchers who deal with Arabic datasets. It applies the most well-known text classification algorithms to an Arabic dataset. In addition, the dataset used here is large in comparison with most Arabic datasets used in other studies, and the work uses different selection and weighting methods for documents. I expect that researchers who write about Arabic datasets will find this work helpful. The algorithms used in this research are naïve Bayesian, support vector machines, artificial neural networks, k-nearest neighbors, C4.5 decision tree, and the Rocchio classifier.

Highlights

There is no doubt that the massive number of available electronic documents makes text classification (TC) one of the most critical topics
One of the main problems of text classification for both English and Arabic is the lack of a general dataset that can be used as a benchmark
Readers can find many studies on text classification using English datasets

Summary

Introduction

There is no doubt that the massive number of available electronic documents makes text classification (TC) one of the most critical topics (Adel Hamdan, 2011; Raed Abu Zitar, 2011; Adel Hamdan, 2013). Text classification is not an easy process, since a single document can contain a great amount of information, and this information may be highly diverse. A large body of research exists on text classification with English datasets (L. Borrajo, 2015; Adel Hamdan, 2016; Adel Hamdan, 2018), but the number of studies and experiments using Arabic datasets is still not enough. In this research, the author applies the most well-known text classification methods and runs his experiments on an Arabic dataset.
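As a rough illustration of what assigning documents to predefined categories involves, the following is a minimal sketch of a multinomial naïve Bayesian classifier with add-one smoothing. It is not the author's implementation: the function names (`tokenize`, `train_nb`, `predict_nb`), the whitespace tokenizer, and the four toy Arabic sentences with their `sports` and `economy` labels are all assumptions made up for this example, and none of them come from the dataset used in the paper.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: plain whitespace splitting. Real Arabic pipelines usually
    # add normalization, stop-word removal, and stemming before this step.
    return text.split()


def train_nb(docs):
    """Count documents per class and word occurrences per class.

    docs is a list of (text, label) pairs.
    """
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_docs[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_docs, word_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log-posterior score."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        # Log prior: fraction of training documents in this class.
        score = math.log(class_docs[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothed log likelihood of the word.
            count = word_counts[label][tok]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus, invented for illustration only.
TOY_DOCS = [
    ("فريق كرة القدم فاز في المباراة", "sports"),
    ("المباراة انتهت بفوز الفريق", "sports"),
    ("البنك رفع سعر الفائدة", "economy"),
    ("السوق المالي شهد ارتفاع الأسهم", "economy"),
]

model = train_nb(TOY_DOCS)
label = predict_nb(model, "فاز الفريق في كرة القدم")
print(label)
```

The other algorithms named in the abstract follow the same train-then-predict pattern but replace the scoring rule; the selection and weighting methods the abstract mentions would change how the token counts are built.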

Naïve Bayesian

Support Vector Machine

Artificial Neural Networks

K-Nearest Neighbor

Rocchio Classifier

Arabic Language

10. Related Studies

11. Dataset

12. Experiments and Analysis

13. Conclusion and Future Work

Journal: Modern Applied Science	Publication Date: Apr 30, 2019
Citations: 11	License type: CC BY 4.0

Similar Papers

Arabic text classification using Polynomial Networks
Mayy M Al-Tahrawi ... Sumaya N Al-Khatib
Journal of King Saud University - Computer and Information Sciences | VOL. 27
10 Sep 2015

Arabic Text Categorization Using Support vector machine, Naïve Bayes and Neural Network
Adel Hamdan Mohammad ... Tariq Alwada‘N
GSTF Journal on Computing (JoC) | VOL. 5
01 Sep 2016

Improving Arabic Text Classification Using P-Stemmer
Tarek Kanan ... Shadi Alzubi
Recent Advances in Computer Science and Communications | VOL. 15
01 Mar 2022

Investigating the relevance of Arabic text classification datasets based on supervised learning
Ahmad Hussein Ababneh
Journal of Electronic Science and Technology | VOL. 20
01 Jun 2022
