Arabic Handwritten Word Recognition Based on Stationary Wavelet Transform Technique using Machine Learning

Abstract

This paper aims to improve the performance of a word recognition system (WRS) for handwritten Arabic text by extracting features in the frequency domain with the Stationary Wavelet Transform (SWT), a wavelet transform approach created to compensate for the absence of translation invariance in the Discrete Wavelet Transform (DWT). The proposed SWT-WRS for Arabic handwritten text consists of three main processes: word normalization, SWT-based feature extraction, and recognition. The proposed SWT-WRS is evaluated on the IFN/ENIT database using Gaussian, linear, and polynomial support vector machine (SVM) classifiers, the k-nearest neighbors (KNN) classifier, and artificial neural network (ANN) classifiers. ANN performance was assessed with the Bayesian Regularization (BR) and Levenberg-Marquardt (LM) training methods. Numerous wavelet transform (WT) families were applied, and the results show that level 19 of the Daubechies family is the best WT family for the proposed SWT-WRS. The results also confirm the effectiveness of the proposed SWT-WRS in improving the recognition of handwritten Arabic words. The suggested SWT-WRS thus overcomes the lack of translation invariance in the DWT by eliminating the up- and down-samplers from the transform.
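The SWT's key difference from the DWT, removing the down-sampling step so every level keeps the input length and the transform becomes shift-invariant, can be sketched in a few lines. This is a minimal one-level illustration with the Haar wavelet and a toy signal, assuming periodic boundary handling; it is not the paper's implementation.

```python
import numpy as np

def swt_haar_level1(signal):
    """One SWT level with Haar filters: convolve with the low- and
    high-pass filters but keep every sample (no downsampling), which
    makes the transform shift-invariant, unlike the decimated DWT."""
    x = np.asarray(signal, dtype=float)
    lo = np.array([1.0, 1.0]) / np.sqrt(2)   # Haar low-pass filter
    hi = np.array([1.0, -1.0]) / np.sqrt(2)  # Haar high-pass filter
    x_ext = np.concatenate([x, x[:1]])       # periodic extension
    approx = np.convolve(x_ext, lo, mode="valid")
    detail = np.convolve(x_ext, hi, mode="valid")
    return approx, detail

sig = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
a, d = swt_haar_level1(sig)
# Both outputs keep the input length -- the defining SWT property.
```

Because nothing is decimated, circularly shifting the input simply shifts the coefficients by the same amount, which is exactly the translation invariance the abstract refers to.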

Similar Papers
  • Conference Article
  • 10.1109/jeeit53412.2021.9634149
Forensic Handwriting Identification System for the Arabic Language Based on Stationary Wavelet Transform (SWT) Fusion Technique
  • Nov 16, 2021
  • Amjad H Alkilani + 1 more

Forensic handwriting analysis aims to link two similar texts, an input text and a stored handwritten text from a specific suspect, using distinctive features such as motion, hand pressure, and character shape. There are various forensic handwriting analysis systems that aim to analyze English handwritten text. However, scarcely any have been proposed to analyze Arabic handwritten texts. Hence, examiners or inspectors are forced to analyze the handwritten text manually, which can be tedious and time-consuming. This study proposes an offline multistage forensic handwriting identification system for the Arabic language based on a Stationary Wavelet Transform (SWT) fusion technique to facilitate, and reduce the time required for, inspectors or forensic examiners to find similarities in handwritten texts. The proposed system has four main processes: normalization and preprocessing; feature extraction using Truncated Singular Value Decomposition (TSVD) and Sparse Random Projection (SRP); feature fusion using SWT; and recognition using polynomial, linear, and Gaussian SVM classifiers. The accuracy of the proposed system is evaluated on the IFN/ENIT dataset of handwritten Arabic text using the polynomial, linear, and Gaussian SVM classifiers. Moreover, the accuracy of the proposed system is compared with that of a state-of-the-art HATRS based on Local Binary Patterns and SVM classifiers, using several normalization sizes of Arabic text images. The experimental results show the effectiveness of the proposed system compared to the HATRS model. The best classification accuracy of the proposed system (98.83%) is obtained using the Gaussian SVM classifier.
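The four-stage pipeline described above can be sketched with scikit-learn's TruncatedSVD and SparseRandomProjection on stand-in data. The SWT fusion step is simplified here to plain feature concatenation, and all array sizes, labels, and component counts are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.random_projection import SparseRandomProjection
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((60, 256))            # stand-in for flattened word images
y = rng.integers(0, 3, size=60)      # stand-in writer/class labels

# Two parallel dimensionality reductions, as in the described system
tsvd_feats = TruncatedSVD(n_components=16, random_state=0).fit_transform(X)
srp_feats = SparseRandomProjection(n_components=16,
                                   random_state=0).fit_transform(X)

# The paper fuses the two feature sets with SWT; plain concatenation
# stands in for that fusion step in this sketch
fused = np.hstack([tsvd_feats, srp_feats])

# "Gaussian" SVM corresponds to scikit-learn's RBF kernel
clf = SVC(kernel="rbf").fit(fused, y)
```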

  • Dissertation
  • Citations: 3
  • 10.4995/thesis/10251/53029
Arabic Text Recognition and Machine Translation
  • Jul 13, 2015
  • Ihab Alkhoury

Research on Arabic Handwritten Text Recognition (HTR) and Arabic-English Machine Translation (MT) has usually been approached as two independent areas of study. However, creating one system that combines both areas, in order to generate an English translation out of images containing Arabic text, is still a very challenging task. This process can be interpreted as the translation of Arabic images. In this thesis, we propose a system that recognizes Arabic handwritten text images and translates the recognized text into English. This system is built from the combination of an HTR system and an MT system. Regarding the HTR system, our work focuses on the use of Bernoulli Hidden Markov Models (BHMMs). BHMMs had already proven to work very well with Latin script; empirical results based on them were reported on well-known corpora such as IAM and RIMES. In this thesis, these results are extended to Arabic script, in particular to the well-known IfN/ENIT and NIST OpenHaRT databases for Arabic handwritten text. The need for transcribing Arabic text is not limited to handwritten text but extends to printed text as well. Arabic printed text might be considered a simpler form of handwritten text, so for this kind of text we also propose Bernoulli HMMs. In addition, we propose to compare BHMMs with state-of-the-art technology based on neural networks. A key idea that has proven very effective in this application of Bernoulli HMMs is the use of a sliding window of adequate width for feature extraction. This idea has allowed us to obtain very competitive results in the recognition of both Arabic handwritten and printed text. Indeed, a system based on it ranked first at the ICDAR 2011 Arabic recognition competition on the Arabic Printed Text Image (APTI) database. Moreover, this idea has been refined by using repositioning techniques for extracted windows, leading to further improvements in Arabic text recognition.
In the case of handwritten text, this refinement improved our system which ranked first at the ICFHR 2010 Arabic handwriting recognition competition on IfN/ENIT. In the case of printed text, this refinement led to an improved system which ranked second at the ICDAR 2013 Competition on Multi-font and Multi-size Digitally Represented Arabic Text on APTI. Furthermore, this refinement was used with neural networks-based technology, which led to state-of-the-art results. For machine translation, the system was based on the combination of three state-of-the-art statistical models: the standard phrase-based models, the hierarchical phrase-based models, and the N-gram phrase-based models. This combination was done using the Recognizer Output Voting Error Reduction (ROVER) method. Finally, we propose three methods of combining HTR and MT to develop an Arabic image translation system. The system was evaluated on the NIST OpenHaRT database, where competitive results were obtained.
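The sliding-window feature extraction credited above with the competitive BHMM results can be sketched as follows; the window width, step size, and toy image are illustrative assumptions, not the thesis's configuration.

```python
import numpy as np

def sliding_window_columns(img, width, step=1):
    """Slide a fixed-width window across a text-line image (H x W)
    and return one flattened feature vector per window position --
    the column-window feature scheme described for the BHMMs."""
    h, w = img.shape
    starts = range(0, w - width + 1, step)
    return np.stack([img[:, s:s + width].reshape(-1) for s in starts])

line = np.arange(30).reshape(5, 6) % 2   # toy 5x6 binary "text line"
frames = sliding_window_columns(line, width=3)
# frames.shape -> (4, 15): 4 window positions, 5*3 pixels per frame
```

Each frame then becomes one observation in the HMM's emission sequence, which is how a 2-D image is turned into the 1-D sequence an HMM expects.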

  • Conference Article
  • 10.1109/jeeit53412.2021.9634154
An Automatic Paleography Script Recognition System for the Arabic Language based on Fast Independent Component Analysis (Fast-ICA) and Support Vector Machine (SVM)
  • Nov 16, 2021
  • Amjad H Alkilani + 1 more

Paleography is the study of historical writing; it is concerned with identifying the date, origin, author(s), and other information about a particular script. There are many paleography text analysis systems that aim to analyze English handwritten text. However, scarcely any have been proposed to analyze Arabic handwritten texts. Hence, Arabic paleographers are forced to analyze the handwritten text manually. To facilitate this work and reduce the time required to analyze scripts for paleographers and archaeologists, an Automatic Paleography Script Recognition (APSR) system for the Arabic language is proposed in this study. The APSR has three main steps: preprocessing, feature extraction using Fast Independent Component Analysis (Fast-ICA), and recognition using polynomial, linear, and Gaussian Support Vector Machine (SVM) classifiers. In the proposed system, the images of the scripts are first normalized to a uniform size. Afterward, the image noise is reduced using a Gaussian blur. Subsequently, the script skeleton is extracted using the erosion operator, and the features of the handwritten script are extracted and selected using Fast-ICA. The accuracy of the proposed system is evaluated on version 2 of the IFN/ENIT dataset of handwritten Arabic text using polynomial, linear, and Gaussian SVM classifiers. Moreover, the accuracy of the proposed system is compared with that of a state-of-the-art OHATRS, which is based on Principal Component Analysis (PCA) and SVM classifiers, using several normalization sizes of Arabic text images. The experimental results show the effectiveness of the proposed system compared to the OHATRS model.
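The preprocessing chain described above (Gaussian blur, erosion-based skeleton, Fast-ICA features) can be sketched with SciPy and scikit-learn on random stand-in images; the blur sigma, binarization threshold, and component count are assumptions, not the paper's parameters.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, binary_erosion
from sklearn.decomposition import FastICA

rng = np.random.default_rng(1)
imgs = rng.random((40, 32, 32))      # stand-in normalized script images

# 1) reduce noise with a Gaussian blur, 2) binarize, 3) erode to a
#    thin stroke map (the paper's erosion-operator skeleton step)
blurred = np.stack([gaussian_filter(im, sigma=1.0) for im in imgs])
binary = blurred > blurred.mean()
eroded = np.stack([binary_erosion(b) for b in binary])

# 4) Fast-ICA on the flattened images as the feature extractor
X = eroded.reshape(len(imgs), -1).astype(float)
feats = FastICA(n_components=8, random_state=0,
                max_iter=500).fit_transform(X)
```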

  • Research Article
  • Citations: 5
  • 10.3390/app14199020
Machine Learning Approach for Arabic Handwritten Recognition
  • Oct 6, 2024
  • Applied Sciences
  • A M Mutawa + 2 more

Text recognition is an important area of the pattern recognition field. Natural language processing (NLP) and pattern recognition have been utilized efficiently in script recognition. Much research has been conducted on handwritten script recognition. However, research on handwritten text recognition for the Arabic language has received little attention compared with other languages. Therefore, it is crucial to develop a new model that can recognize Arabic handwritten text. Most of the existing models used to recognize Arabic text are based on traditional machine learning techniques. Therefore, we implemented a new model using deep learning techniques by integrating two deep neural networks. In the new model, the architecture of the Residual Network (ResNet) model is used to extract features from raw images. Then, Bidirectional Long Short-Term Memory (BiLSTM) and connectionist temporal classification (CTC) are used for sequence modeling. Our system improved the recognition rate of Arabic handwritten text compared to other models of a similar type, with a character error rate of 13.2% and a word error rate of 27.31%. In conclusion, the domain of Arabic handwritten recognition is advancing swiftly with the use of sophisticated deep learning methods.
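The character and word error rates quoted above are both normalized Levenshtein distances, over characters and over whitespace-separated words respectively. A minimal sketch (not the paper's evaluation code):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance over any two sequences (chars or words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution
        prev = cur
    return prev[-1]

def cer(ref, hyp):
    """Character error rate: edits divided by reference length."""
    return edit_distance(ref, hyp) / max(len(ref), 1)

def wer(ref, hyp):
    """Word error rate: same idea, computed over word sequences."""
    ref_words, hyp_words = ref.split(), hyp.split()
    return edit_distance(ref_words, hyp_words) / max(len(ref_words), 1)

print(cer("kitten", "sitting"))  # 3 edits / 6 reference chars = 0.5
```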

  • Research Article
  • Citations: 13
  • 10.14569/ijacsa.2019.0101227
Handwritten Arabic Text Recognition using Principal Component Analysis and Support Vector Machines
  • Jan 1, 2019
  • International Journal of Advanced Computer Science and Applications
  • Faisal Al-Saqqar + 3 more

In this paper, an offline holistic handwritten Arabic text recognition system based on Principal Component Analysis (PCA) and Support Vector Machine (SVM) classifiers is proposed. The proposed system consists of three primary stages: preliminary processing, feature extraction using PCA, and classification using the polynomial, linear, and Gaussian SVM classifiers. In this system, the text skeleton is first extracted and the text images are normalized to a uniform size for extracting the global features of the Arabic words using PCA. Recognition performance was evaluated on version 2 of the IFN/ENIT database of handwritten Arabic text using the polynomial, linear, and Gaussian SVM classifiers. The classification results of the proposed system were compared with the results produced by a benchmark ATRS that depends on the Discrete Cosine Transform (DCT) method, using numerous normalization sizes of Arabic text images. The experimental testing results support the effectiveness of the proposed system in holistic recognition of handwritten Arabic text.
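The PCA-plus-SVM stage with the three kernels compared in the paper can be sketched with scikit-learn; the data here are random stand-ins and the component count is an assumption. Note that scikit-learn's `rbf` kernel corresponds to the paper's "Gaussian" SVM.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X = rng.random((50, 64 * 64))        # stand-in flattened word images
y = rng.integers(0, 5, size=50)      # stand-in word-class labels

scores = {}
for kernel in ("linear", "poly", "rbf"):   # rbf = "Gaussian" SVM
    # Global (holistic) features via PCA, then the SVM classifier
    model = make_pipeline(PCA(n_components=20), SVC(kernel=kernel))
    scores[kernel] = model.fit(X, y).score(X, y)
```

In a real evaluation the score would of course come from a held-out split of IFN/ENIT rather than the training data used in this sketch.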

  • Research Article
  • Citations: 11
  • 10.3991/ijim.v14i16.16005
A Holistic Model for Recognition of Handwritten Arabic Text Based on the Local Binary Pattern Technique
  • Sep 22, 2020
  • International Journal of Interactive Mobile Technologies (iJIM)
  • Atallah Al-Shatnawi + 2 more

<p class="0abstract">In this paper, we introduce a multi-stage offline holistic handwritten Arabic text recognition model using the Local Binary Pattern (LBP) technique and two machine-learning approaches; Support Vector Machines (SVM) and Artificial Neural Network (ANN). In this model, the LBP method is utilized for extracting the global text features without text segmentation. The suggested model was tested and utilized on version II of the IFN/ENIT database applying the polynomial, linear, and Gaussian SVM and ANN classifiers. Performance of the ANN was assessed using the Levenberg-Marquardt (LM), Bayesian Regularization (BR), and Scaled Conjugate Gradient (SCG) training methods. The classification outputs of the herein suggested model were compared and verified with the results obtained from two benchmark Arabic text recognition models (ATRSs) that are based on the Discrete Cosine Transform (DCT) and Principal Component Analysis (PCA) methods using various normalization sizes of images of Arabic text. The classification outcomes of the suggested model are promising and better than the outcomes of the examined benchmarks models. The best classification accuracies of the suggested model (97.46% and 94.92%) are obtained using the polynomial SVM classifier and the BR ANN training methods, respectively.</p>

  • Research Article
  • Citations: 25
  • 10.3390/ijerph18062954
Identifying the Risk Factors Associated with Nursing Home Residents' Pressure Ulcers Using Machine Learning Methods.
  • Mar 13, 2021
  • International Journal of Environmental Research and Public Health
  • Soo-Kyoung Lee + 4 more

Background: Machine learning (ML) can keep improving predictions and generating automated knowledge via data-driven predictors or decisions. Objective: The purpose of this study was to compare different ML methods including random forest, logistic regression, linear support vector machine (SVM), polynomial SVM, radial SVM, and sigmoid SVM in terms of their accuracy, sensitivity, specificity, negative predictive values, and positive predictive values by validating real datasets to predict factors for pressure ulcers (PUs). Methods: We applied representative ML algorithms (random forest, logistic regression, linear SVM, polynomial SVM, radial SVM, and sigmoid SVM) to develop a prediction model (N = 60). Results: The random forest model showed the greatest accuracy (0.814), followed by logistic regression (0.782), polynomial SVM (0.779), radial SVM (0.770), linear SVM (0.767), and sigmoid SVM (0.674). Conclusions: The random forest model showed the greatest accuracy for predicting PUs in nursing homes (NHs). Diverse factors that predict PUs in NHs, including NH characteristics and residents' characteristics, were identified according to the diverse ML methods. These factors should be considered to decrease PUs in NH residents.
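The evaluation metrics compared in the study (accuracy, sensitivity, specificity, positive and negative predictive values) all derive from the four confusion-matrix counts. A small self-contained sketch with made-up 0/1 labels:

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall on positives), specificity
    (recall on negatives), PPV and NPV from 0/1 label arrays."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

m = binary_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
# tp=2, fn=1, tn=2, fp=1 -> accuracy 4/6, sensitivity and
# specificity both 2/3
```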

  • Book Chapter
  • Citations: 3
  • 10.1093/obo/9780199772810-0245
Visual Word Recognition
  • Aug 28, 2019
  • Melvin J. Yap

Words are the building blocks of language, and visual word recognition is a crucial prerequisite for skilled reading. Before we can pronounce a word or understand what it means, we have to first recognize it (i.e., the visually presented word makes contact with its underlying mental representation). Although several tasks have been developed to tap word recognition performance, researchers have primarily relied on lexical decision (classifying letter strings as words or nonwords), speeded pronunciation (reading a word or nonword aloud), and semantic classification (e.g., classifying a word as animate or inanimate). Despite the apparent ease of visual word recognition, the processes that support the mapping of spelling-to-sound and spelling-to-meaning are far from perfectly understood and remain the object of active investigations. Beyond shedding light on reading, literacy, and language development, the visual word recognition literature has helped inform our understanding of other cognitive domains (e.g., pattern recognition, attention, memory), while propelling advances in computational modeling and cognitive neuroscience. Because words can be coded and analyzed at multiple levels (e.g., orthography, phonology, semantics), much of the empirical research has explored the functional relationships between orthographic, phonological, and semantic variables and word recognition performance across lexical processing tasks. In addition to studying the recognition of isolated words, there is a rich literature examining how different prime contexts influence the processing of subsequently presented words. Such primes can be orthographically, phonologically, semantically, or morphologically related to targets and are either visible or masked (i.e., presented so briefly that conscious perception is minimized).
Turning to methodology, although the classical factorial design continues to dominate word recognition research, an increasing amount of work has been leveraging the megastudy approach, whereby researchers examine word recognition performance for large sets of words, which are defined by the language rather than by the experimenter. Collectively, the basic findings from isolated and primed visual word recognition have been used to develop and constrain increasingly powerful computational models of word recognition and task performance. Moving forward, the visual word recognition literature is likely to be increasingly characterized by studies that rely on powerful analytical tools (e.g., linear mixed effects analyses, analysis of response time distributions) and which give more consideration to the role of individual differences. Finally, in light of space constraints, this article focuses on references that deal with how visually presented English words are recognized. There is an important and growing literature that explores the lexical processing of other alphabetic (e.g., Spanish, French, German) and nonalphabetic (e.g., Chinese, Korean) languages and the interplay between languages in the multilingual lexicon.

  • Research Article
  • Citations: 25
  • 10.1109/access.2020.3035884
Method for Classifying a Noisy Raman Spectrum Based on a Wavelet Transform and a Deep Neural Network
  • Jan 1, 2020
  • IEEE Access
  • Liangrui Pan + 5 more

Because it is relatively difficult in practice to classify the Raman spectrum under baseline noise and additive white Gaussian noise environments, this paper proposes a new framework based on a wavelet transform and a deep neural network for identification of noisy Raman spectra. The framework consists of two main engines. A wavelet transform is proposed as the framework front end for transforming the 1-D noisy Raman spectrum into two-dimensional data. The two-dimensional data are fed to the framework back end, which is a classifier. The optimum classifier is chosen by implementing several traditional machine learning (ML) and deep learning (DL) algorithms and investigating their classification accuracy and robustness. The four chosen ML classifiers are naive Bayes (NB), a support vector machine (SVM), a random forest (RF), and a k-nearest neighbor (KNN) classifier; a deep convolutional neural network (DCNN) was chosen as the DL classifier. Noise-free, Gaussian-noise, baseline-noise, and mixed-noise Raman spectra were applied to train and validate the ML and DCNN models. The optimum back-end classifier was obtained by testing the ML and DCNN models with several noisy Raman spectra (10-30 dB noise power). Based on the simulation, the accuracy of the DCNN classifier is 9% higher than that of the NB classifier, 3.5% higher than the RF classifier, 1% higher than the KNN classifier, and 0.5% higher than the SVM classifier. In terms of robustness to mixed-noise scenarios, the framework with the DCNN back end showed superior performance compared with the other ML back ends. The DCNN back end achieved 90% accuracy at 3 dB SNR, while the NB, SVM, RF, and K-NN back ends required 27 dB, 22 dB, 27 dB, and 23 dB SNR, respectively. In addition, on the low-noise test dataset, the F-measure score of the DCNN back end exceeded 99.1%, while the F-measure scores of the other ML engines were below 98.7%.
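The noisy test conditions above (for example, the 3 dB SNR case) can be reproduced by scaling additive white Gaussian noise to a target SNR relative to the clean signal's power. A sketch with a toy sinusoid standing in for a Raman spectrum; the signal, length, and seed are assumptions:

```python
import numpy as np

def add_awgn(signal, snr_db, rng=None):
    """Add white Gaussian noise so the result has the requested SNR
    (in dB) relative to the clean signal's mean power."""
    rng = rng or np.random.default_rng(0)
    x = np.asarray(signal, dtype=float)
    p_signal = np.mean(x ** 2)
    p_noise = p_signal / (10 ** (snr_db / 10))   # invert SNR formula
    return x + rng.normal(0.0, np.sqrt(p_noise), size=x.shape)

t = np.linspace(0, 1, 2048)
clean = np.sin(2 * np.pi * 5 * t)     # toy stand-in "spectrum"
noisy = add_awgn(clean, snr_db=3)     # 3 dB: the hardest case quoted
```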

  • Research Article
  • Citations: 39
  • 10.3390/ijerph17176234
Application of Machine Learning Methods in Nursing Home Research.
  • Aug 27, 2020
  • International Journal of Environmental Research and Public Health
  • Soo-Kyoung Lee + 3 more

Background: A machine learning (ML) system is able to construct algorithms that continue improving predictions and generate automated knowledge through data-driven predictors or decisions. Objective: The purpose of this study was to compare six ML methods (random forest (RF), logistic regression, linear support vector machine (SVM), polynomial SVM, radial SVM, and sigmoid SVM) for predicting falls in nursing homes (NHs). Methods: We applied the six representative ML algorithms to the preprocessed dataset to develop a prediction model (N = 60). We used an accuracy measure to evaluate the prediction models. Results: RF was the most accurate model (0.883), followed by the logistic regression model, linear SVM, and polynomial SVM (0.867). Conclusions: RF was a powerful algorithm for discerning predictors of falls in NHs. For effective fall management, researchers should consider organizational characteristics as well as personal factors. Recommendations for Future Research: To confirm the superiority of ML in NH research, future studies are required to discern additional potential factors using newly introduced ML methods.

  • Research Article
  • Citations: 20
  • 10.1016/j.rsase.2023.101110
New approach for predicting nitrogen and pigments in maize from hyperspectral data and machine learning models
  • Nov 27, 2023
  • Remote Sensing Applications: Society and Environment
  • Bianca Cavalcante Da Silva + 9 more

Fast diagnosis from hyperspectral data and machine learning (ML) models to predict nitrogen (N) and pigment content in maize crops is key to optimizing nitrogen fertilization. This research assessed the efficiency of five ML algorithms, the best phenological stage, and the sensitivity of the 90 spectra for estimating N and pigment content. As its novelty, this field research tests which of the five ML algorithms accurately estimates nitrogen, chlorophyll, and carotenoid content in maize leaves at different phenological stages using hyperspectral band data. The treatments were arranged in a factorial scheme with four N doses (0, 54, 108, and 216 kg ha−1) combined with five leaf collection seasons at phenological stages V6 to V14. The ML models tested were artificial neural networks (ANN), a decision tree adapted for prediction problems (M5P), the REPTree decision tree, random forest (RF), a polynomial support vector machine (PSVM), and ZeroR (ZR, control). Spectral bands 530–560 nm and 690–750 nm are effective wavelengths because the visible region with lower reflectance (530–560 nm) is sensitive to N uptake and to chlorophyll and carotenoid content, while the red-edge and near-infrared region is sensitive to N and chlorophyll content. The random forest (RF) model performed best, with better correlation (r) and mean absolute error (MAE) between predicted and observed values for all variables: the correlation coefficient (r) was around 0.6 and the MAE below 0.5 for the prediction of chlorophyll a+b. For the prediction of flavonoids, r was around 0.6 and the error was 0.07. Support vector machine (SVM) and RF efficiently predicted nitrogen content; in predicting NF, the r values for both algorithms were above 0.35 and the error was below 2.75.
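The r and MAE criteria used to rank the models above can be sketched with a random-forest regressor on synthetic stand-in data; the feature/target construction here is purely illustrative, not the study's dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
X = rng.random((80, 12))                   # stand-in band reflectances
y = X[:, 0] * 2 + rng.normal(0, 0.1, 80)   # stand-in pigment values

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
pred = model.predict(X)

r = np.corrcoef(y, pred)[0, 1]     # Pearson correlation coefficient
mae = np.mean(np.abs(y - pred))    # mean absolute error
```

As with any in-sample evaluation, the r and MAE here are optimistic; the study's values come from predictions on observed field data.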

  • Research Article
  • Citations: 21
  • 10.1016/j.image.2022.116827
KOHTD: Kazakh offline handwritten text dataset
  • Jul 16, 2022
  • Signal Processing: Image Communication
  • Nazgul Toiganbayeva + 6 more


  • Research Article
  • Citations: 54
  • 10.1016/j.asoc.2008.08.006
Region growing based segmentation algorithm for typewritten and handwritten text recognition
  • Aug 31, 2008
  • Applied Soft Computing
  • Khalid Saeed + 1 more


  • Research Article
  • Citations: 64
  • 10.1016/j.proeng.2013.09.156
ANN based Evaluation of Performance of Wavelet Transform for Condition Monitoring of Rolling Element Bearing
  • Jan 1, 2013
  • Procedia Engineering
  • H.S Kumar + 3 more


  • Research Article
  • Citations: 138
  • 10.1145/2431211.2431222
Offline arabic handwritten text recognition
  • Feb 1, 2013
  • ACM Computing Surveys
  • Mohammad Tanvir Parvez + 1 more

Research in offline Arabic handwriting recognition has increased considerably in the past few years. This is evident from the numerous research results published recently in major journals and conferences in the area of handwriting recognition. Features and classification techniques utilized in recent research work have diversified noticeably compared to the past. Moreover, more effort has been directed, in the last few years, to constructing different databases for Arabic handwriting recognition. This article provides a comprehensive survey of recent developments in Arabic handwriting recognition. The article starts with a summary of the characteristics of Arabic text, followed by a general model for an Arabic text recognition system. Then the databases used for Arabic text recognition are discussed. Research work on the preprocessing phase, such as text representation, baseline detection, and line, word, character, and subcharacter segmentation algorithms, is presented. Different feature extraction techniques used in Arabic handwriting recognition are identified and discussed. Different classification approaches, like HMM, ANN, SVM, k-NN, syntactical methods, etc., are discussed in the context of Arabic handwriting recognition. Works on Arabic lexicon construction and spell checking are presented in the postprocessing phase. Several summary tables of published research work are provided for the Arabic text databases used and the reported results on Arabic character, word, numeral, and text recognition. These tables summarize the features, classifiers, data, and reported recognition accuracy for each technique. Finally, we discuss some future research directions in Arabic handwriting recognition.
