Abstract
In this paper, we show how to classify Arabic document images using a convolutional neural network, which is one of the most common supervised deep learning algorithms. The main goal of using deep learning is its ability to automatically extract useful features from images, which eliminates the need for a manual feature extraction process. Convolutional neural networks can extract features from images through a convolution process involving various filters. We collected a variety of Arabic document images from various sources and passed them into a convolutional neural network classifier. We adopt a VGG16 pre-trained network trained on ImageNet to classify the dataset of four classes as handwritten, historical, printed, and signboard. For the document image classification, we used VGG16 convolutional layers, ran the dataset through them, and then trained a classifier on top of it. We extract features by fixing the pre-trained network's convolutional layers, then adding the fully connected layers and training them on the dataset. We update the network with the addition of dropout by adding after each max-pooling layer and to the fourteen and the seventeenth layers which are the fully connected layers. The proposed approach achieved a classification accuracy of 92%.
Highlights
Documents classification is traditionally considered an important task and the first step in several document image processing pipelines, including document retrieval, information extraction, and text recognition
We develop a system for classifying Arabic document images into four classes: handwritten, historical, typed, and signboard
The pre-trained model VGG16, which was trained on ImageNet, was used in the Convolutional Neural Network (CNN) model
Summary
Documents classification is traditionally considered an important task and the first step in several document image processing pipelines, including document retrieval, information extraction, and text recognition. A wide range of classification problems can be solved using the deep learning technique. The creation of local and/or global image descriptors is the focus of the second category of work. These descriptors are used to categorize documents. The third category of methods employs CNN to automatically learn and extract features from document images, which are categorized. Several problems in image processing and understanding have been solved using deep learning methods, including document image classification, handwriting recognition, and blind image quality assessment. Various shallow-structure learning methods and handcrafted features were used to solve these issues [7]
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
More From: International Journal of Advanced Computer Science and Applications
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.