Meta‐feature based few‐shot Siamese learning for Urdu optical character recognition

Asma Naseer, Kashif Zafar

doi:10.1111/coin.12530

Abstract

Standard convolutional neural networks (CNNs) achieve a high level of accuracy in recognizing characters in different languages. However, like other deep neural networks, a CNN requires a substantial amount of training data. A lack of sufficient training data introduces dataset bias during the learning process, which degrades the performance of the CNN. This limitation can be addressed by using few‐shot learners. In this research, a CNN‐based few‐shot Siamese learner is trained on meta‐features, extracted from Urdu text images using a novel graph‐based normal to tangent line (GNTL) technique, for Urdu optical character recognition (OCR) across different font sizes. The learner is trained on three corpora (datasets): one benchmark corpus, “Centre for Language Engineering Text Images”, and two other corpora, “Urdu Thickness Graphs” (UTG) and “Urdu OCR Font 16 to 36” (UOF), which are developed and released in this research. 80% of the data is used for training and the remaining 20% for testing. To create the UTG corpus, the proposed GNTL feature extraction technique is used to build a meta‐features‐based corpus in the form of thickness graphs. The third corpus, UOF, covers five font sizes: 16, 20, 26, 30, and 36. The performance of the few‐shot Siamese learner is compared with that of a standard CNN trained on the same three corpora. The meta‐feature based few‐shot Siamese learner achieves promising recognition accuracy and outperforms the standard CNN by around 3 percentage points: on average, the few‐shot Siamese learner reaches a performance of 96.82%, while the standard CNN reaches 93.96%.
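The general idea of a Siamese few‐shot learner can be illustrated with a minimal sketch. It is not the authors' implementation: the one‐layer `embed` function is a hypothetical stand‐in for the shared CNN branch, the contrastive loss is a common choice for Siamese training that the abstract does not confirm, and the labels `alef` and `bay` and all feature values are invented for illustration. The `split_80_20` helper only mirrors the 80%/20% train/test partition described above.

```python
import math
import random

def embed(x, weights):
    """Shared embedding applied to BOTH inputs of a pair.
    Assumption: a single linear layer with tanh stands in for the CNN branch."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]

def distance(a, b):
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def contrastive_loss(d, same, margin=1.0):
    """Assumed training objective: pull same-class pairs together,
    push different-class pairs at least `margin` apart."""
    return d ** 2 if same else max(0.0, margin - d) ** 2

def classify(query, support, weights):
    """Few-shot prediction: return the label of the nearest support example
    in embedding space (one example per class in this sketch)."""
    q = embed(query, weights)
    return min(support, key=lambda label: distance(q, embed(support[label], weights)))

def split_80_20(items, seed=0):
    """Shuffle and split into 80% training and 20% testing portions."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = (len(shuffled) * 8) // 10
    return shuffled[:cut], shuffled[cut:]

# Hypothetical two-class support set with made-up 2-D meta-features.
weights = [[1.0, 0.0], [0.0, 1.0]]
support = {"alef": [1.0, 0.0], "bay": [0.0, 1.0]}
prediction = classify([0.9, 0.1], support, weights)
train, test = split_80_20(range(10))
```

Because both inputs pass through the same `embed` function, a new character class can be recognized from a single support example without retraining, which is what makes this family of models attractive when training data is scarce.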

Full Text