Urdu Nastaliq recognition using convolutional–recursive deep learning

Saeeda Naz, Arif I Umar, Riaz Ahmad, Imran Siddiqi, Saad B Ahmed, Muhammad I Razzak, Faisal Shafait

doi:10.1016/j.neucom.2017.02.081

Abstract

Recent developments in the recognition of cursive scripts rely on implicit feature extraction methods, which give better results than traditional hand-crafted feature extraction approaches. We present a hybrid approach based on explicit feature extraction that combines convolutional and recursive neural networks for feature learning and classification of cursive Urdu Nastaliq script. The first layer extracts low-level, translation-invariant features using a Convolutional Neural Network (CNN); these features are then forwarded to a Multi-dimensional Long Short-Term Memory neural network (MDLSTM) for contextual feature extraction and learning. Experiments are carried out on the publicly available Urdu Printed Text-line Image (UPTI) dataset using the proposed hierarchical combination of CNN and MDLSTM. A recognition rate of up to 98.12% for 44 classes is achieved, outperforming the state-of-the-art results on the UPTI dataset.

Full Text