Abstract

Transformer neural networks have increasingly become the neural network design of choice, having recently been shown to outperform state-of-the-art end-to-end (E2E) recurrent neural networks (RNNs). Transformers utilize a self-attention mechanism to relate input frames and extract more expressive sequence representations. Compared with RNNs, transformers also allow parallel computation and better capture long-range contextual dependencies. This work introduces a transformer-based model for the online handwriting recognition (OnHWR) task. As the transformer follows an encoder-decoder architecture, we investigated the self-attention encoder (SAE) with two different decoders: a self-attention decoder (SAD) and a connectionist temporal classification (CTC) decoder. The proposed models can recognize complete sentences without the need to integrate external language modules. We tested our proposed models against two Arabic online handwriting datasets: Online-KHATT and CHAW. On evaluation, the SAE-SAD architecture performed better than the SAE-CTC architecture. The SAE-SAD model achieved a 5% character error rate (CER) and an 18% word error rate (WER) on the CHAW dataset, and a 22% CER and a 56% WER on the Online-KHATT dataset. The SAE-SAD model showed significant improvements over existing models for Arabic OnHWR.
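For illustration, the following is a minimal sketch of an SAE paired with the two decoder options described above, assuming PyTorch; the frame feature size, vocabulary size, layer counts, and module names are placeholder assumptions for exposition, not the paper's configuration.

import torch
import torch.nn as nn


class OnHWRTransformer(nn.Module):
    """Self-attention encoder (SAE) over pen-trajectory frames with two
    alternative heads: a self-attention decoder (SAD) and a CTC projection.
    Positional encodings are omitted for brevity."""

    def __init__(self, feat_dim=7, vocab_size=80, d_model=256, nhead=4,
                 num_layers=4, dim_ff=1024):
        super().__init__()
        self.frame_proj = nn.Linear(feat_dim, d_model)  # embed input frames
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead, dim_ff, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers)       # SAE
        dec_layer = nn.TransformerDecoderLayer(d_model, nhead, dim_ff, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers)       # SAD
        self.char_emb = nn.Embedding(vocab_size, d_model)
        self.sad_out = nn.Linear(d_model, vocab_size)      # SAD character logits
        self.ctc_out = nn.Linear(d_model, vocab_size + 1)  # +1 for the CTC blank

    def encode(self, frames):
        # frames: (batch, time, feat_dim) -> (batch, time, d_model)
        return self.encoder(self.frame_proj(frames))

    def forward_sad(self, frames, prev_chars):
        # Autoregressive decoding: attend to encoder memory with a causal mask.
        memory = self.encode(frames)
        tgt = self.char_emb(prev_chars)
        t = tgt.size(1)
        causal = torch.triu(torch.full((t, t), float("-inf"), device=tgt.device), 1)
        return self.sad_out(self.decoder(tgt, memory, tgt_mask=causal))

    def forward_ctc(self, frames):
        # Per-frame log-probabilities for nn.CTCLoss, which expects time-major input.
        logits = self.ctc_out(self.encode(frames))
        return logits.log_softmax(-1).transpose(0, 1)

In this sketch the two decoders share the same encoder output: the SAD head is trained with cross-entropy over shifted character targets, while the CTC head is trained with nn.CTCLoss and decoded by collapsing repeated labels and blanks.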
