Abstract

Unconstrained off-line handwritten text recognition, in general and for Arabic-like scripts in particular, is a challenging task and remains an active research area. Transformer-based models for English handwriting recognition have recently shown promising results. In this paper, we explore the use of the transformer architecture for Urdu handwriting recognition. The highlights of the proposed work are the use of a convolutional neural network before a vanilla full transformer, and the use of Urdu printed text-line images alongside handwritten text-line images during training. The convolutional layers reduce the spatial resolution and thereby compensate for the \(n^{2}\) complexity of the transformer's multi-head attention layers. Moreover, the printed text images in the training phase help the model learn a greater number of ligatures (a prominent feature of Arabic-like scripts) and a better language model. Our model achieved state-of-the-art accuracy (CER of \(5.31\%\)) on the publicly available NUST-UHWR dataset (Zia et al. in Neural Comput Appl 34:1–14, 2021).
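The motivation for placing convolutional layers before the transformer can be quantified: because self-attention scores every token pair, its cost grows quadratically with sequence length, so downsampling the image width before attention yields a quadratic saving. The sketch below illustrates this arithmetic; the specific input width (1024 columns) and total CNN stride (8) are illustrative assumptions, not values taken from the paper.

```python
# Sketch: why convolutional downsampling helps transformer attention.
# Assumptions (illustrative, not from the paper): a text-line image
# 1024 columns wide, and a CNN stem with a total stride of 8.

def attention_pairs(seq_len: int) -> int:
    """Query-key pairs scored by one self-attention layer:
    grows quadratically (n^2) with sequence length."""
    return seq_len * seq_len

width = 1024           # hypothetical input width, in columns/tokens
stride = 8             # hypothetical total downsampling of the CNN stem
reduced = width // stride

print(attention_pairs(width))    # 1048576 pairs without the CNN
print(attention_pairs(reduced))  # 16384 pairs after downsampling
print(attention_pairs(width) // attention_pairs(reduced))  # 64x fewer
```

An 8x reduction in sequence length thus cuts the attention cost by a factor of 64, which is why even a shallow convolutional front end makes full self-attention over long text lines tractable.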
