Discrimination Of English To Other Indian Languages (Kannada And Hindi) For OCR System

Ankit Kumar

doi:10.5121/ijcsea.2012.2214

Abstract

India is a multilingual, multi-script country. In every state, two languages are in use: the local state language and English. For example, in Andhra Pradesh, a state in India, a document may contain text words in both English and Telugu script. For Optical Character Recognition (OCR) of such a bilingual document, the script must be identified before the text words are fed to the OCR of the individual script. In this paper, we introduce a simple and efficient script identification technique for Kannada, English and Hindi text words in a printed document. The proposed approach uses horizontal and vertical projection profiles to discriminate among the three scripts; the features are extracted from the horizontal projection profile of each text word. We analysed 700 different words of Kannada, English and Hindi to extract the discriminating features and to develop the knowledge base. The proposed system was tested on 100 different document images containing more than 1000 text words of each script, and achieved classification rates of 98.25%, 99.25% and 98.87% for Kannada, English and Hindi, respectively.
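The projection profiles named in the abstract can be illustrated with a minimal sketch. It assumes a word image already segmented and binarised into rows of 0/1 pixels (1 = ink). The abstract does not say which features the authors derive from the profile, so `peak_row_feature` and `toy_word` are hypothetical illustrations, not the paper's method.

```python
def horizontal_projection(word):
    """Count foreground (1) pixels in each row of a binary word image."""
    return [sum(row) for row in word]


def vertical_projection(word):
    """Count foreground (1) pixels in each column of a binary word image."""
    return [sum(col) for col in zip(*word)]


def peak_row_feature(word):
    """Hypothetical feature, not taken from the paper.

    Returns (relative vertical position of the densest row,
             fraction of the word width that row fills).
    """
    profile = horizontal_projection(word)
    peak = max(profile)
    row = profile.index(peak)  # first row that reaches the peak count
    return row / len(word), peak / len(word[0])


# Toy 5x6 binary image (assumed data) with a full-width stroke in row 1.
toy_word = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0],
]

print(horizontal_projection(toy_word))
print(vertical_projection(toy_word))
print(peak_row_feature(toy_word))
```

A word whose densest row spans nearly the full width and sits near the top, as in the toy image, is the kind of pattern a headline stroke in Devanagari (Hindi) text tends to produce. Whether the paper relies on this particular cue is not stated in the abstract.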

Journal: International Journal of Computer Science, Engineering and Applications
Publication Date: Apr 30, 2012
Citations: 3
License type: cc-by

Similar Papers

Text line script identification for a tri-lingual document
Prakash K Aithal ... N V Krishnamoorthi M Subbareddy
01 Jul 2010

Script Identification for a Tri-lingual Document
Prakash K Aithal ... N V Subbareddy
01 Jan 2010

Analysis of Segmentation Methods for Brahmi Script
Ajay Pratap Singh ... Ashwin Kumar Kushwaha
DESIDOC Journal of Library & Information Technology | VOL. 39
11 Mar 2019

Segmentation and Recognition of Handwritten Kannada Text Using Relevance Feedback and Histogram of Oriented Gradients – A Novel Approach
Karthik S ... Srikanta Murthy
International Journal of Advanced Computer Science and Applications | VOL. 7
01 Jan 2015
