Abstract

Offline Handwritten Text Recognition (HTR) is a challenging computer vision task in which images are the only source of information. Several approaches to the optical model have been developed, including Hidden Markov Models (HMMs) and bidirectional or multidimensional recurrent layers. The current state of the art combines deep learning techniques in Convolutional Recurrent Neural Networks (CRNNs), whose recurrent layers still suffer from the vanishing gradient problem when processing very long texts. Moreover, high-performance models generally have millions of trainable parameters and a high computational cost. Recently, however, a new optical model architecture, the Gated-CNN, has shown improvements that complement CRNN modeling. In this work, we present a new small architecture for HTR, based on the Gated-CNN, integrated with two language-model steps, at the character level and at the word level. We evaluate it alongside 9 state-of-the-art approaches and validate the results on the public IAM dataset. The proposed model surpasses the results obtained by the other approaches in the literature, reaching a character error rate (CER) of 2.7% and a word error rate (WER) of 5.6%, an improvement of 13% over the best results on the IAM dataset.
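The abstract reports results as CER and WER without defining them. As a minimal sketch, assuming the usual definitions (Levenshtein edit distance between prediction and ground truth, divided by the ground-truth length in characters or in whitespace-separated words), the two metrics could be computed as follows. The function names are illustrative and are not taken from the paper, whose exact normalization and tokenization may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits per reference character."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word edits per reference word (whitespace split, an assumption)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

For example, `cer("handwritten text", "handwriten text")` counts one missing character out of 16, while `wer` on the same pair counts one wrong word out of two, which illustrates why WER is typically much higher than CER for the same output.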
