Abstract

In this paper, we evaluate the recognition performance of a CRNN (Convolutional Recurrent Neural Network) on printed documents in Indian languages. We compare the CRNN with MDLSTM (two-dimensional LSTM) and Tesseract (LSTM) on the same test bed, and the CRNN outperforms both earlier approaches. Experiments cover seven Indian languages: Hindi, Marathi, Tamil, Kannada, Malayalam, Bangla and Gurumukhi. The dataset contains approximately 5000 pages per language, divided into training, validation and test sets, and the network is fed pre-segmented text lines. The proposed fused network, the CRNN, performs feature extraction and sequence labeling as a single unit. A Convolutional Neural Network (CNN) is used to eliminate the dependency on hand-crafted features: it directly processes the binarized 2D image of each line and extracts features with learned kernels. A Recurrent Neural Network (RNN) with a BLSTM (bidirectional LSTM) architecture is then applied to the learned features to produce sequence probabilities. A CTC (Connectionist Temporal Classification) output layer transcribes these probabilities into labels drawn from the Unicode range of the evaluated language plus one blank label. The layers and their parameters are selected empirically and kept the same for all languages.
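
The transcription step described above can be illustrated with a minimal sketch. It assumes greedy (best-path) CTC decoding, a blank label at index 0, and a label set covering the Devanagari Unicode block (U+0900 to U+097F), as one might use for Hindi or Marathi. The abstract does not specify the decoding strategy or the label indexing, so the names `build_alphabet` and `greedy_ctc_decode` and the index layout are hypothetical choices made for illustration only.

```python
from typing import List, Sequence

BLANK = 0                  # assumed index of the CTC blank label
DEVANAGARI_START = 0x0900  # first code point of the Devanagari Unicode block
DEVANAGARI_END = 0x097F    # last code point of the Devanagari Unicode block


def build_alphabet() -> List[str]:
    """Hypothetical label set: index 0 is blank, the rest is the Unicode block."""
    return [""] + [chr(cp) for cp in range(DEVANAGARI_START, DEVANAGARI_END + 1)]


def greedy_ctc_decode(probs: Sequence[Sequence[float]], alphabet: List[str]) -> str:
    """Best-path decoding: take the argmax per frame, merge repeats, drop blanks."""
    # Most probable label index for every time step (frame) of the line image.
    best_path = [max(range(len(frame)), key=frame.__getitem__) for frame in probs]

    chars = []
    prev = BLANK
    for label in best_path:
        # Emit a character only when it is not blank and differs from the
        # previous frame's label; a blank between two equal labels keeps both.
        if label != BLANK and label != prev:
            chars.append(alphabet[label])
        prev = label
    return "".join(chars)
```

In this sketch the per-frame probabilities would come from the BLSTM output, and the blank label lets the decoder separate genuinely repeated characters from a single character that spans several frames.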
