Abstract

Urdu is a cursive script belonging to the family of non-Latin scripts that also includes Arabic, Chinese, and Hindi. Urdu text poses a challenge for detection/localization in natural scene images, and consequently for the recognition of individual ligatures in those images. In this paper, a methodology is proposed that covers detection, orientation prediction, and recognition of Urdu ligatures in outdoor images. As a first step, a custom Faster R-CNN algorithm is used in conjunction with well-known CNNs, namely SqueezeNet, GoogLeNet, ResNet18, and ResNet50, for detection and localization on images of size $320\times 240$ pixels. For ligature orientation prediction, a custom Regression Residual Neural Network (RRNN) is trained and tested on datasets containing randomly oriented ligatures. Recognition of ligatures is performed using a Two-Stream Deep Neural Network (TSDNN). In our experiments, five datasets, containing 4.2K and 51K synthetic images with embedded Urdu text, were generated using the CLE annotation text to evaluate the detection, orientation prediction, and recognition tasks. These synthetic images contain 132 and 1600 unique ligatures for the 4.2K and 51K sets respectively, with 32 variations of each ligature (4 backgrounds and 8 font-color variations). In addition, 1094 real-world images containing more than 12K Urdu characters were used for TSDNN's evaluation. Finally, all four detectors were compared on their ability to detect/localize Urdu text using average precision (AP). The ResNet50-based Faster R-CNN was the best detector, with an AP of 0.98, while the SqueezeNet-, GoogLeNet-, and ResNet18-based detectors achieved test APs of 0.65, 0.88, and 0.87 respectively. RRNN achieved an accuracy of 79% and 99% on the 4.2K and 51K image sets respectively. Similarly, for character classification within ligatures, TSDNN attained partial sequence recognition rates of 94.90% and 95.20% for the 4.2K and 51K image sets, and 76.60% for real-world images.
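The detector comparison above is reported as average precision (AP). As a minimal illustration of how such a score can be computed (this is a generic sketch, not the paper's exact evaluation code), AP can be accumulated over ranked detections, adding the precision at each true positive weighted by the recall increment:

```python
def average_precision(scored_hits, num_gt):
    """Average precision from ranked detections.

    scored_hits: list of (confidence, is_true_positive) pairs.
    num_gt: total number of ground-truth boxes.
    Computes the area under the precision-recall curve.
    """
    # Rank detections by descending confidence.
    ranked = sorted(scored_hits, key=lambda x: -x[0])
    tp = 0
    ap = 0.0
    for rank, (_, hit) in enumerate(ranked, start=1):
        if hit:
            tp += 1
            # Precision at this rank, weighted by the recall step 1/num_gt.
            ap += (tp / rank) / num_gt
    return ap
```

For example, with detections `[(0.9, True), (0.8, False), (0.7, True)]` and two ground-truth boxes, the precisions at the two true positives are 1/1 and 2/3, giving an AP of 5/6.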

Highlights

  • As autonomous vehicles and other intelligent devices (cell phones, mobile devices [1], robots [2]) come online, they need to understand the environment in which they operate

  • We present a novel methodology for Urdu text, covering the entire spectrum of text detection, recognition, and orientation prediction

  • To determine the best Convolutional Neural Networks (CNNs) for Urdu text detection, the first set of 4.2K synthetic dataset images is taken as input to the CNNs; four different CNNs are used as feature extractors to train four Faster R-CNN models
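When comparing detectors as in the highlights above, each predicted box is typically matched to a ground-truth box by intersection over union (IoU). A minimal sketch, assuming axis-aligned boxes in `(x1, y1, x2, y2)` form (a hypothetical format, not necessarily the paper's):

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Coordinates of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```

A detection is usually counted as a true positive when its IoU with an unmatched ground-truth box exceeds a threshold such as 0.5.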


Summary

INTRODUCTION

As autonomous vehicles and other intelligent devices (cell phones, mobile devices [1], robots [2]) come online, they need to understand the environment in which they operate. Outdoor Urdu text datasets are therefore needed to evaluate different kinds of learning models and judge their effectiveness for text detection, orientation prediction, and ligature recognition (J. Iqbal: Urdu-Text Detection and Recognition in Natural Scene Images Using Deep Learning). This paper performs a comprehensive analysis of CNN features to determine their effectiveness for Urdu text detection in outdoor pictures at the ligature level, for predicting the orientations of ligatures, and for recognizing individual characters in a text. It is the first study in the Urdu text/script detection literature to use four different kinds of CNN features for detection. Recognition of Urdu ligatures is performed by a TSDNN that combines ResNet and GoogLeNet features with a BLSTM, on both synthetic and real outdoor images.
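The partial sequence recognition rate used to score character recognition is, in essence, a character-level accuracy over predicted ligature strings. One common way to compute such a rate (an assumption for illustration, not the paper's exact definition) is via the edit distance between predicted and ground-truth character sequences:

```python
def edit_distance(a, b):
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def partial_recognition_rate(predictions, ground_truths):
    """Fraction of ground-truth characters recovered across all sequences."""
    total = sum(len(gt) for gt in ground_truths)
    errors = sum(edit_distance(p, g) for p, g in zip(predictions, ground_truths))
    return max(0.0, 1.0 - errors / total)
```

For instance, predicting `"abc"` against ground truth `"abcd"` leaves one edit error over four characters, a rate of 0.75.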

PROPOSED METHODOLOGY
Findings
CONCLUSION AND FUTURE DIRECTIONS

