Abstract

<p class="0abstract">In this paper, we introduce a multi-stage, offline, holistic handwritten Arabic text recognition model using the Local Binary Pattern (LBP) technique and two machine-learning approaches: Support Vector Machines (SVM) and Artificial Neural Networks (ANN). In this model, the LBP method extracts global text features without text segmentation. The model was tested on version II of the IFN/ENIT database using polynomial, linear, and Gaussian SVM classifiers and ANN classifiers. ANN performance was assessed using the Levenberg-Marquardt (LM), Bayesian Regularization (BR), and Scaled Conjugate Gradient (SCG) training methods. The classification outputs of the model were compared with the results of two benchmark Arabic text recognition systems (ATRSs) based on the Discrete Cosine Transform (DCT) and Principal Component Analysis (PCA), using various normalization sizes for the Arabic text images. The classification results of the suggested model are promising and better than those of the examined benchmark models. The best classification accuracies (97.46% and 94.92%) are obtained using the polynomial SVM classifier and the BR ANN training method, respectively.</p>

Highlights

  • The ultimate goal of any Handwritten Arabic Text Recognition System (HATRS) is to simulate human understanding of text, so that a computer system can read, understand, and process text in a way similar to the human mind [1]

  • The major contributions of this study are: (i) extraction of handwritten Arabic text features by applying the Local Binary Pattern (LBP) method; (ii) evaluation of the derived features on version 2 of the IFN/ENIT database of handwritten Arabic text using two machine-learning approaches (Support Vector Machine (SVM) and Artificial Neural Network (ANN)), where the proposed model is tested using the polynomial, linear, and Gaussian SVM classifiers and three ANN training methods (Levenberg-Marquardt (LM), Bayesian Regularization (BR), and Scaled Conjugate Gradient (SCG)); and (iii) a comparison of recognition accuracy between the proposed HATRS and two benchmark HATRSs, one based on the Discrete Cosine Transform (DCT) and one based on Principal Component Analysis (PCA), both developed by Al-Saqqar et al. [6]

  • To verify the performance of the suggested HATRS, its recognition results were compared with those of two benchmark ATRSs, one developed by [6] based on Principal Component Analysis (PCA) and another based on DCT


Summary

Introduction

The ultimate goal of any Handwritten Arabic Text Recognition System (HATRS) is to simulate human understanding of text, so that a computer system can read, understand, and process text in a way similar to the human mind [1]. An offline, segmentation-free (holistic) Arabic text recognition system is composed of four phases: image acquisition, image preprocessing, feature extraction, and feature classification (recognition). This paper proposes a novel multi-stage HATRS model based on the Local Binary Pattern (LBP) and two machine-learning approaches: SVM and ANN. The model progresses in four steps: skeleton extraction, text image normalization, feature extraction using LBP, and classification via the polynomial, linear, and Gaussian SVM classifiers and ANN classifiers. The major contributions of this study are: (i) extraction of handwritten Arabic text features by applying the LBP method; (ii) evaluation of the derived features on version 2 of the IFN/ENIT database of handwritten Arabic text using two machine-learning approaches (SVM and ANN), where the proposed model is tested using the polynomial, linear, and Gaussian SVM classifiers and three ANN training methods (Levenberg-Marquardt (LM), Bayesian Regularization (BR), and Scaled Conjugate Gradient (SCG)); and (iii) a comparison of recognition accuracy between the proposed HATRS and two benchmark HATRSs, one based on the Discrete Cosine Transform (DCT) and one based on Principal Component Analysis (PCA), both developed by Al-Saqqar et al. [6].
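
To make the feature-extraction step concrete, the following is a minimal sketch of a holistic LBP descriptor. It assumes the basic 3x3, 8-neighbour LBP operator and a normalized 256-bin histogram over the whole word image; this summary does not state which LBP variant, bit ordering, or histogram layout the authors used, so those choices, along with the names `lbp_code` and `lbp_histogram`, are illustrative assumptions rather than the paper's implementation.

```python
from typing import List

# Grayscale image as rows of pixel intensities (assumed representation).
Image = List[List[int]]

# Clockwise neighbour offsets (row, col), starting at the top-left pixel.
# The bit ordering is an assumption; any fixed ordering gives a valid LBP code.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img: Image, r: int, c: int) -> int:
    """Return the 8-bit LBP code of the interior pixel at (r, c).

    Each neighbour contributes a 1 bit when its intensity is greater than
    or equal to the centre pixel's intensity, and a 0 bit otherwise.
    """
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img: Image) -> List[float]:
    """Normalized 256-bin histogram of LBP codes over all interior pixels.

    The histogram describes the whole image at once, so no segmentation
    of the text into characters is needed.
    """
    hist = [0] * 256
    rows, cols = len(img), len(img[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            hist[lbp_code(img, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * 256
```

In a pipeline like the one described above, such a histogram would be computed on the skeletonized, size-normalized word image and passed as the feature vector to the SVM or ANN classifier.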

Related Works
The Proposed Model
The skeleton extraction
Normalization
Feature extraction using LBP
Text classification
Experimental Results and Discussion
Conclusions and Future Directions
