Abstract

Handwritten character recognition is widely used in the modern world across a variety of applications and fields, but it remains difficult for the Arabic language. The Arabic alphabet is distinguished by the fact that many characters share similar shapes and differ only in the placement of dots relative to the main body of the character. Furthermore, some writers render these dots as dashes, which further complicates the task of a recognition system. Although this field of study has received a great deal of attention over the last three decades, it still faces several challenges, such as the variability of human handwriting and its cursive nature. In this paper, we propose a new system built on the PHOG descriptor, which is inspired by the pyramid representation and the Histogram of Oriented Gradients. The performance of most machine learning models depends on the quality, quantity, and relevance of the data, yet insufficient data is one of the most common challenges in machine learning implementations, since collecting data is often expensive and time consuming. For this reason, we tested various parameters of the descriptor and applied a data augmentation method based on Gaussian noise, which significantly increased accuracy. For the classification phase, we used five-fold cross-validation, a statistical method for estimating the skill of machine learning models. K-nearest neighbors, decision trees, random forests, and naive Bayes are the four machine learning techniques we used to train and evaluate our system. We also used ensemble methods, mainly bagging and stacking, to evaluate the impact of combining multiple models. We tested our system on the AlexU Isolated Alphabet (AIA9K) dataset, which contains 8,737 characters. Compared to other systems using the same database, the results of the proposed system are promising. The experiments revealed that classification with the stacking algorithm outperforms classification with individual classifiers and with the bagging algorithm when naive Bayes, k-nearest neighbors, and decision trees are used as meta-classifiers. We achieved a high classification rate of 97.22% with random forest.
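To make the pipeline concrete, the sketch below shows one possible reading of the abstract: a PHOG-style descriptor (orientation histograms concatenated over a spatial pyramid), Gaussian-noise augmentation, and a stacking ensemble evaluated with five-fold cross-validation in scikit-learn. It is illustrative only, not the authors' code: the random stand-in data, the assumed 32x32 grayscale images, the pyramid depth, noise level, and the choice of random forest as the stacking combiner are all placeholder assumptions.

```python
# Illustrative sketch (not the authors' implementation): PHOG-style features,
# Gaussian-noise augmentation, and a stacking ensemble with 5-fold CV.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score


def phog(image, levels=3, bins=8):
    """PHOG-style descriptor: gradient-orientation histograms over a pyramid."""
    gy, gx = np.gradient(image.astype(float))
    magnitude = np.hypot(gx, gy)
    orientation = np.mod(np.arctan2(gy, gx), np.pi)   # unsigned angles in [0, pi)
    features = []
    for level in range(levels):
        cells = 2 ** level                            # 1x1, 2x2, 4x4, ... grids
        for rows in np.array_split(np.arange(image.shape[0]), cells):
            for cols in np.array_split(np.arange(image.shape[1]), cells):
                mag = magnitude[np.ix_(rows, cols)].ravel()
                ori = orientation[np.ix_(rows, cols)].ravel()
                hist, _ = np.histogram(ori, bins=bins, range=(0, np.pi), weights=mag)
                features.append(hist)
    feat = np.concatenate(features)
    return feat / (np.linalg.norm(feat) + 1e-12)      # L2-normalise the descriptor


def augment_with_gaussian_noise(images, labels, sigma=0.05, seed=0):
    """Double the set by adding zero-mean Gaussian noise to every image."""
    rng = np.random.default_rng(seed)
    noisy = np.clip(images + rng.normal(0.0, sigma, images.shape), 0.0, 1.0)
    return np.concatenate([images, noisy]), np.concatenate([labels, labels])


# Toy stand-in for the AIA9K images; real character images would be loaded here.
rng = np.random.default_rng(0)
images = rng.random((200, 32, 32))       # hypothetical 32x32 grayscale characters
labels = rng.integers(0, 28, 200)        # 28 Arabic letter classes

images, labels = augment_with_gaussian_noise(images, labels)
X = np.array([phog(img) for img in images])

# One possible stacking layout: KNN, decision tree and naive Bayes feed a
# random-forest combiner (an assumption; the paper's exact wiring may differ).
stack = StackingClassifier(
    estimators=[("knn", KNeighborsClassifier(n_neighbors=3)),
                ("dt", DecisionTreeClassifier()),
                ("nb", GaussianNB())],
    final_estimator=RandomForestClassifier(n_estimators=200),
    cv=5,
)
scores = cross_val_score(stack, X, labels, cv=5)
print("5-fold accuracy: %.4f +/- %.4f" % (scores.mean(), scores.std()))
```

On real character images the same script would only change in the data-loading step; the descriptor, augmentation, and cross-validated stacking ensemble stay as written.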
