Arabic Document Classification by Deep Learning

Taghreed Alghamdi, Samia Snoussi, Lobna Hsairi

doi:10.14569/ijacsa.2021.0121034

Abstract

In this paper, we show how to classify Arabic document images using a convolutional neural network (CNN), one of the most common supervised deep learning algorithms. The main motivation for using deep learning is its ability to extract useful features from images automatically, which eliminates the need for manual feature extraction. CNNs extract features from images by convolving them with various filters. We collected Arabic document images from a variety of sources and passed them to a CNN classifier. We adopt a VGG16 network pre-trained on ImageNet to classify the dataset into four classes: handwritten, historical, printed, and signboard. For document image classification, we use the VGG16 convolutional layers as a feature extractor: we freeze the pre-trained convolutional layers, add fully connected layers on top, and train those layers on the dataset. We also add dropout after each max-pooling layer and at the fourteenth and seventeenth layers, which are the fully connected layers. The proposed approach achieved a classification accuracy of 92%.
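The building blocks named in the abstract (convolution with a filter, max pooling, and dropout applied after pooling) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the 6x6 toy image, the hand-written vertical-edge filter, and the dropout rate of 0.5 are all assumptions chosen for illustration, and a real VGG16 learns its filters rather than using fixed ones.

```python
import random


def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(fmap):
    """Keep positive responses, clamp negative ones to zero."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2; a trailing odd row or column is dropped."""
    return [
        [
            max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
            for c in range(0, len(fmap[0]) - 1, 2)
        ]
        for r in range(0, len(fmap) - 1, 2)
    ]


def dropout(fmap, rate, rng):
    """Inverted dropout (training time only): zero each value with
    probability `rate` and rescale the survivors by 1 / (1 - rate)."""
    keep = 1.0 - rate
    return [[v / keep if rng.random() < keep else 0.0 for v in row] for row in fmap]


# Toy 6x6 "image": dark left half (0), bright right half (1). Assumed data.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]

# Hand-written vertical-edge filter (an assumption; CNN filters are learned).
vertical_edge = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]

feature_map = relu(conv2d_valid(image, vertical_edge))  # 4x4 response map
pooled = max_pool_2x2(feature_map)                      # 2x2 after pooling
dropped = dropout(pooled, rate=0.5, rng=random.Random(0))  # rate is assumed
```

The filter responds only where the dark-to-bright edge falls inside its window, which is the sense in which a convolution "extracts a feature". Dropout is applied only during training; at inference time the pooled map would be passed through unchanged.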

Highlights

Document classification is traditionally considered an important task and the first step in several document image processing pipelines, including document retrieval, information extraction, and text recognition.
We develop a system for classifying Arabic document images into four classes: handwritten, historical, typed, and signboard.
The Convolutional Neural Network (CNN) model is built on VGG16 pre-trained on ImageNet.
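The transfer-learning pattern behind these highlights, a frozen pre-trained base with a trainable classifier on top, can be sketched in pure Python. Everything here is a toy stand-in and an assumption: `frozen_features` is a hypothetical fixed projection in place of the VGG16 convolutional layers, the 2-D inputs are synthetic, and the single softmax layer is far smaller than the fully connected layers the paper describes. The point is only that the base is never updated while the head is trained.

```python
import math

CLASSES = ["handwritten", "historical", "printed", "signboard"]

# Hypothetical stand-in for the frozen convolutional base. These weights are
# fixed and are never touched by the training loop below.
FROZEN_W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]


def frozen_features(x):
    """Fixed linear projection followed by ReLU (a toy 'feature extractor')."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in FROZEN_W]


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def head_probs(weights, biases, feats):
    """Class probabilities from the trainable head for one feature vector."""
    logits = [
        sum(w * f for w, f in zip(row, feats)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


def train_head(samples, labels, steps=4000, lr=0.1):
    """Full-batch gradient descent on cross-entropy; only the head is updated."""
    feats = [frozen_features(x) for x in samples]  # computed once: base is frozen
    n_feat, n_cls, n = len(feats[0]), len(CLASSES), len(samples)
    weights = [[0.0] * n_feat for _ in range(n_cls)]
    biases = [0.0] * n_cls
    for _ in range(steps):
        grad_w = [[0.0] * n_feat for _ in range(n_cls)]
        grad_b = [0.0] * n_cls
        for f, y in zip(feats, labels):
            probs = head_probs(weights, biases, f)
            for c in range(n_cls):
                err = probs[c] - (1.0 if c == y else 0.0)  # d(loss)/d(logit_c)
                grad_b[c] += err / n
                for j in range(n_feat):
                    grad_w[c][j] += err * f[j] / n
        for c in range(n_cls):
            biases[c] -= lr * grad_b[c]
            for j in range(n_feat):
                weights[c][j] -= lr * grad_w[c][j]
    return weights, biases


def predict(weights, biases, x):
    """Return the most probable class name for one raw input."""
    probs = head_probs(weights, biases, frozen_features(x))
    return CLASSES[probs.index(max(probs))]


# Synthetic 2-D inputs, one per class, standing in for document images.
samples = [[2.0, 0.0], [0.0, 2.0], [2.0, 2.0], [0.0, 0.0]]
labels = [0, 1, 2, 3]
weights, biases = train_head(samples, labels)
predictions = [predict(weights, biases, x) for x in samples]
```

Because the base is frozen, its features are computed once before the loop and reused at every step, which is one practical reason feature extraction with a pre-trained network is cheaper than training a CNN from scratch. On this toy set the trained head should recover all four labels; this says nothing about the 92% accuracy reported for the real dataset.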

Summary

Introduction

Document classification is traditionally considered an important task and the first step in several document image processing pipelines, including document retrieval, information extraction, and text recognition. A wide range of classification problems can be solved using deep learning techniques. The second category of work focuses on creating local and/or global image descriptors, which are then used to categorize documents. The third category of methods employs a CNN to automatically learn and extract features from document images, which are then classified. Deep learning methods have been used to solve several problems in image processing and understanding, including document image classification, handwriting recognition, and blind image quality assessment. Various shallow-structure learning methods and handcrafted features have also been used to address these problems [7].

Journal: International Journal of Advanced Computer Science and Applications
Publication Date: Jan 1, 2021
Citations: 1
License type: cc-by
