Optical Character Recognition System for Nastalique Urdu-Like Script Languages Using Supervised Learning

S S R Rizvi, A Sagheer, A Muhammad, K Adnan

doi:10.1142/s0218001419530045

Abstract

There are two main techniques for converting handwritten or printed text into digital format. The first is to capture an image of the text, but images are large and require considerable storage, and text in image form cannot be edited, searched, or copied. The second is to use an Optical Character Recognition (OCR) system. OCR systems read documents and convert their text into digital text, which can then be processed to extract knowledge. A large amount of Urdu-language data exists in handwritten or printed form and needs to be converted into digital format for knowledge acquisition. The highly cursive, structurally complex, bidirectional, and compound nature of Urdu script makes accurate OCR difficult. In this study, a supervised learning-based OCR system is proposed for the Nastalique Urdu language. Evaluated under a variety of experimental settings, the proposed system achieves results of 98.4% in training and 97.3% in testing, which is the highest recognition rate achieved so far by any Urdu-language OCR system. The proposed system is simple to implement, especially on the software side of an OCR system; the proposed technique is applicable to both printed and handwritten text, and it will help in developing more accurate Urdu OCR software in the future.

Journal: International Journal of Pattern Recognition and Artificial Intelligence
Publication Date: Sep 1, 2019
Citations: 17

Similar Papers

Soft Computing Techniques for Optical Character Recognition Systems
Arindam Chaudhuri ... Pratixa Badelia
24 Dec 2016

OmniPage vs. Sakhr: paired model evaluation of two Arabic OCR products
Tapas Kanungo ... Jiangying Zhou
07 Jan 1999

JPEG for Arabic Handwritten Character Recognition: Add a Dimension of Application
Abdurazzag Ali ... Salem Ali
01 Oct 2008

Prediction of OCR accuracy using simple image features
L.R Blando ... J Kanai
14 Aug 1995
