Handwritten Arabic character recognition remains a challenging pattern recognition task because the script is cursive and many characters are visually similar. Deep learning techniques have produced promising results in this domain, but refinements to the model architecture can improve performance further. This study introduces a hybrid deep learning approach that combines Convolutional Neural Networks (CNNs) with bidirectional recurrent neural networks, specifically Bidirectional Long Short-Term Memory (Bi-LSTM) and Bidirectional Gated Recurrent Unit (Bi-GRU) layers. The convolutional component captures spatial features, while the recurrent component models the temporal dynamics and contextual relationships present in handwritten Arabic text. In experiments on the AHCD and Hijja benchmark datasets, the CNN-Bi-GRU framework achieved state-of-the-art accuracies of 97.05% and 91.78%, respectively, outperforming previous deep learning-based methods. These results show that integrating temporal modeling and contextual representation into the handwriting recognition pipeline yields clear performance gains without requiring explicit segmentation. The findings advance the development of accurate deep learning systems for Arabic handwriting recognition, with applications in domains that rely on efficient text extraction from handwritten documents.
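To make the hybrid idea concrete, the following is a minimal sketch of how a convolutional feature map might be handed to a bidirectional GRU. It is not the authors' implementation: the abstract does not state how the feature map is turned into a sequence, so reading its columns as time steps is an assumption, and the names `feature_map_to_sequence`, `GRUCell`, and `bidirectional_gru`, the layer sizes, and the random untrained weights are all hypothetical. The CNN itself is omitted and replaced by a toy feature map.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    # Plain matrix-vector product on nested lists.
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def feature_map_to_sequence(feature_map):
    # ASSUMPTION: each column of the CNN output is one time step.
    # feature_map is indexed as [channel][row][col]; each step is the
    # flattened (channel, row) slice at that column.
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    return [
        [feature_map[c][r][t] for c in range(channels) for r in range(height)]
        for t in range(width)
    ]


class GRUCell:
    # Hypothetical minimal GRU cell without biases, with random (untrained) weights.
    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One input and one recurrent matrix per gate:
        # update (z), reset (r), candidate (n).
        self.W = {g: rand_matrix(hidden_size, input_size, rng) for g in "zrn"}
        self.U = {g: rand_matrix(hidden_size, hidden_size, rng) for g in "zrn"}

    def _pre(self, gate, x, h):
        return [a + b for a, b in zip(matvec(self.W[gate], x), matvec(self.U[gate], h))]

    def step(self, x, h):
        z = [sigmoid(v) for v in self._pre("z", x, h)]
        r = [sigmoid(v) for v in self._pre("r", x, h)]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(v) for v in self._pre("n", x, reset_h)]
        # Blend the candidate state with the previous hidden state.
        return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def run(cell, sequence):
    h = [0.0] * cell.hidden_size
    outputs = []
    for x in sequence:
        h = cell.step(x, h)
        outputs.append(h)
    return outputs


def bidirectional_gru(sequence, forward_cell, backward_cell):
    forward = run(forward_cell, sequence)
    # Process the reversed sequence, then restore the original order.
    backward = run(backward_cell, sequence[::-1])[::-1]
    # Concatenate both directions at every time step.
    return [f + b for f, b in zip(forward, backward)]


rng = random.Random(0)
# Toy stand-in for a CNN output: 2 channels, 3 rows, 4 columns.
toy_map = [[[rng.random() for _ in range(4)] for _ in range(3)] for _ in range(2)]
steps = feature_map_to_sequence(toy_map)
context = bidirectional_gru(steps, GRUCell(6, 5, rng), GRUCell(6, 5, rng))
print(len(context), len(context[0]))
```

In a full classifier along these lines, the per-step outputs in `context` would be pooled or flattened and passed to a softmax layer over the character classes. Swapping the GRU cell for an LSTM cell would give the Bi-LSTM variant.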