Abstract

The accuracy of current natural scene text recognition algorithms is limited by the poor performance of character recognition methods for these images. Complex backgrounds, variations in writing style, text size and orientation, low resolution, and multi-language text make recognition of text in natural images a challenging task. Conventional machine learning and deep learning-based methods have achieved satisfactory results, but character recognition for cursive scripts such as Arabic and Urdu in natural images is still an open research problem. Characters in cursive text are connected, making them difficult to segment for recognition. Variations in the shape of a character due to its different positions within a word make the recognition task more challenging than for non-cursive text. Optical character recognition (OCR) techniques proposed for Arabic and Urdu scanned documents perform very poorly when applied to character recognition in natural images. In this paper, we propose a multi-scale feature aggregation (MSFA) and a multi-level feature fusion (MLFF) network architecture to recognize isolated Urdu characters in natural images. The network first aggregates multi-scale features of the convolutional layers by up-sampling and addition operations and then combines them with the high-level features. Finally, the outputs of the MSFA and MLFF networks are fused together to create more robust and powerful features. A comprehensive dataset of segmented Urdu characters is developed for the evaluation of the proposed network models. Synthetic text is rendered on image patches with real natural scene backgrounds to increase the number of samples for infrequently used characters. The proposed model is evaluated on the Chars74K and ICDAR03 datasets. To validate the proposed model on the new Urdu character image dataset, we compare its performance with the histogram of oriented gradients (HoG) method. The experimental results show that the aggregation and fusion of multi-scale and multi-level features is effective and outperforms other methods on the Urdu character image and Chars74K datasets.
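To make the architecture description concrete, the following is a minimal PyTorch sketch of the multi-scale aggregation and fusion idea described in the abstract: multi-scale feature maps are up-sampled to a common resolution and added (the MSFA branch), then combined with the high-level features before classification. The number of stages, channel widths, input size, and the concatenation-based fusion head are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MSFAMLFFSketch(nn.Module):
    """Illustrative sketch, not the paper's exact network."""

    def __init__(self, num_classes, in_channels=3):
        super().__init__()
        # Three convolutional stages producing feature maps at
        # progressively lower spatial resolutions (assumed widths).
        self.stage1 = nn.Sequential(nn.Conv2d(in_channels, 32, 3, padding=1),
                                    nn.ReLU(), nn.MaxPool2d(2))
        self.stage2 = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1),
                                    nn.ReLU(), nn.MaxPool2d(2))
        self.stage3 = nn.Sequential(nn.Conv2d(64, 64, 3, padding=1),
                                    nn.ReLU(), nn.MaxPool2d(2))
        # 1x1 projections so the multi-scale maps share one channel
        # count and can be added after up-sampling (MSFA branch).
        self.proj1 = nn.Conv2d(32, 64, 1)
        self.proj2 = nn.Conv2d(64, 64, 1)
        self.classifier = nn.Linear(64 * 2, num_classes)

    def forward(self, x):
        f1 = self.stage1(x)   # low-level features
        f2 = self.stage2(f1)  # mid-level features
        f3 = self.stage3(f2)  # high-level features
        # MSFA: up-sample deeper maps to the shallow resolution and add.
        size = f1.shape[-2:]
        msfa = (self.proj1(f1)
                + F.interpolate(self.proj2(f2), size=size,
                                mode='bilinear', align_corners=False)
                + F.interpolate(f3, size=size,
                                mode='bilinear', align_corners=False))
        # Combine the aggregated map with the high-level features and
        # fuse the two branches by concatenation before classifying.
        msfa_vec = F.adaptive_avg_pool2d(msfa, 1).flatten(1)
        high_vec = F.adaptive_avg_pool2d(f3, 1).flatten(1)
        fused = torch.cat([msfa_vec, high_vec], dim=1)
        return self.classifier(fused)
```

As a usage example, a forward pass on a batch of eight 32x32 RGB character patches would be `MSFAMLFFSketch(num_classes=40)(torch.randn(8, 3, 32, 32))`; the class count of 40 is only a placeholder.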

Highlights

  • Rapid developments in camera-based portable devices have facilitated the acquisition of a large number of images every day

  • To handle the challenging problem of Urdu text recognition in natural scene images, we propose a new convolutional neural network (CNN) architecture that integrates convolutional features from different layers of the network and combines them with high-level features to create a fused representation

  • To analyze the quality of the proposed method, we evaluated it on the Chars74K [33] and ICDAR03 [34] English natural scene character datasets and compared its performance, in terms of F-score, with a number of state-of-the-art character recognition methods (the standard F-score definition is given below)
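For reference, the F-score used in these comparisons is, under the usual definition, the harmonic mean of precision $P$ and recall $R$; the paper's exact protocol is given in its EVALUATION METRICS section.

```latex
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2PR}{P + R}
```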


Summary

INTRODUCTION

Rapid developments in camera-based portable devices have facilitated the acquisition of a large number of images every day. The ICDAR has published a multi-language natural scene image dataset that includes Arabic and eight other languages [14], whereas the datasets, techniques, evaluation protocols and results achieved for Chinese text detection and end-to-end recognition are reported in [13]. In these ICDAR robust reading competitions, the problem of text extraction is generally divided into four sub-tasks: (i) text detection, (ii) isolated character recognition, (iii) cropped word recognition and (iv) end-to-end text recognition. In this research study, a new dataset is created that contains images of isolated characters manually segmented from natural scene images containing Urdu text. Before passing this dataset to the CNN classifier for classification and recognition, preprocessing operations are performed to give the dataset a uniform and standard representation.
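The following is a minimal sketch of the kind of preprocessing mentioned above, namely bringing each segmented character patch to a uniform representation. The target size (32x32), the use of grayscale, and the intensity scaling are illustrative assumptions; the paper's exact preprocessing steps are not specified here.

```python
import numpy as np
from PIL import Image

def preprocess(path, size=(32, 32)):
    """Load one segmented character patch and standardize it."""
    img = Image.open(path).convert('L')     # assumed grayscale conversion
    img = img.resize(size, Image.BILINEAR)  # uniform spatial size
    arr = np.asarray(img, dtype=np.float32) / 255.0  # scale to [0, 1]
    return arr[None, ...]                   # add a channel dimension
```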

RELATED WORK
METHODOLOGY
HISTOGRAM OF ORIENTED GRADIENTS
EVALUATION METRICS
CLASSIFICATION RESULTS ON URDU CHARACTER DATASET
Findings
CONCLUSION AND FUTURE WORK