A Survey on Arabic Text Classification Using Deep and Machine Learning Algorithms

Farah A Abdulghani, Nada A.Z Abdullah

doi:10.24996/ijs.2022.63.1.37

Abstract

Text categorization is the process of grouping texts or documents into classes or categories according to their content. The process consists of three phases: preprocessing, feature extraction, and classification. Compared with English, only a few studies have addressed the categorization and classification of Arabic text. For applications such as text classification and clustering, representing Arabic text is difficult because the language is noted for its richness, diversity, and complicated morphology. This paper presents a comprehensive analysis and comparison of studies from the last five years, organized by dataset, year, algorithms, and reported accuracy. Deep Learning (DL) and Machine Learning (ML) models were used in these studies to enhance text classification for the Arabic language. The paper concludes with remarks for future work.

Highlights

Finding useful knowledge on a given subject in a vast, rapidly growing volume of online textual data is a difficult challenge.
The findings showed that deep learning classification models are very promising for the Arabic text classification problem.
The CNN Arabic news dataset consists of 5070 documents divided into 6 classes: sport, SciTech, entertainment, Middle East, business, and world [2,3,4].
Preprocessing is required when dealing with text data, in order to select the features that semantically represent the document and remove the features that do not.

Summary

Introduction

Finding useful knowledge on a given subject in a vast, rapidly growing volume of online textual data is a difficult challenge. El-Alami et al. (2016) [4] suggested an effective approach to Arabic Text Categorization (ATC) based on deep learning, using a deep stacked autoencoder that takes word-count vectors as input. They used Restricted Boltzmann Machines (RBM) in the pre-training stage; to build the deep network, they unrolled the model and applied backpropagation during the fine-tuning stage. El-Alami et al. (2020) [11] proposed an Arabic text categorization method based on Bag-of-Concepts and deep autoencoder representations. It incorporates explicit semantics relying on Arabic WordNet and exploits Chi-Square measures to select the most informative features.
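As a rough illustration of two ideas mentioned above, word-count vectors and Chi-Square feature scoring, the following is a minimal sketch on a toy corpus. It is not the method of El-Alami et al., which works on Bag-of-Concepts built from Arabic WordNet; here the features are plain words. The function names (`tokenize`, `word_count_vector`, `chi_square_scores`), the four sample documents, and the choice to strip only diacritics and tatweel are all assumptions made for the example.

```python
import re
from collections import Counter

# Assumption: preprocessing strips only tashkeel (U+064B..U+0652) and tatweel (U+0640).
_MARKS = re.compile(r"[\u064B-\u0652\u0640]")

def tokenize(text):
    """Remove diacritics, then split on whitespace (deliberately simple)."""
    return _MARKS.sub("", text).split()

def word_count_vector(text, vocab):
    """Count how often each vocabulary term occurs in one document."""
    counts = Counter(tokenize(text))
    return [counts[term] for term in vocab]

def chi_square_scores(docs, labels, target):
    """Chi-square score of each term against one class, from a 2x2
    presence/absence contingency table over documents."""
    n = len(docs)
    token_sets = [set(tokenize(d)) for d in docs]
    vocab = set().union(*token_sets)
    n_target = sum(1 for y in labels if y == target)
    scores = {}
    for term in vocab:
        a = sum(1 for s, y in zip(token_sets, labels) if term in s and y == target)
        b = sum(1 for s, y in zip(token_sets, labels) if term in s and y != target)
        c = n_target - a          # term absent, document in target class
        d = n - n_target - b      # term absent, document in another class
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores

# Hypothetical toy corpus: two "sport" and two "business" documents.
docs = ["كرة القدم مباراة", "مباراة كرة فريق", "سوق الأسهم اقتصاد", "اقتصاد بنك سوق"]
labels = ["sport", "sport", "business", "business"]

scores = chi_square_scores(docs, labels, target="sport")
vocab = sorted(scores)
vectors = [word_count_vector(d, vocab) for d in docs]
```

Chi-square measures the strength of association in either direction, so a term that appears only in the other class scores as high as one that appears only in the target class. A feature selector would keep the highest-scoring terms and pass the reduced word-count vectors to the classifier.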

Evaluation of the model

Conclusions

Future work


Journal: Iraqi Journal of Science	Publication Date: Jan 30, 2022
Citations: 4	License type: cc-by
