A binary-tree-based OCR technique for machine-printed characters

B Gatos, N Papamarkos, C Chamzas

doi:10.1016/s0952-1976(97)00013-4

Abstract

This paper describes the structure of an optical character recognition (OCR) system for printed documents. The system is trained on Latin and Greek typewritten text, but it can be easily adapted to any typewritten character set. The proposed method is divided into two main stages. In the first stage, suitable binary features are extracted, most of which are independent of the scaling and rotation of the characters. A binary tree classification technique is then used, and an optimal tree classifier is constructed. In the second stage, the characters at the end-nodes of the binary tree are classified using a new template-matching technique. By setting a suitable threshold for the matching, a decision can be reached for most of the characters. For those characters that the binary tree cannot recognize with high confidence, a secondary minimum-distance classifier, trained with the Zernike moments of the characters, is used. Experimental results show that the recognition rate of the proposed OCR system can exceed 99.5%.
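
The decision cascade outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the binary features, the tree-construction procedure, or the new template-matching measure, so every name here (`Node`, `Leaf`, `descend`, `match_score`, `nearest_by_moments`, `classify`) is hypothetical. The pixel-agreement score stands in for the paper's matching technique, the threshold of 0.9 is an arbitrary placeholder, and the moment vectors are assumed to be precomputed Zernike-moment features. The sketch also assumes the fallback classifier compares against all character prototypes rather than only the candidates at the reached end-node.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Union

Image = Sequence[Sequence[int]]  # binary image as rows of 0/1 pixels


@dataclass
class Leaf:
    candidates: List[str]  # character labels that reach this end-node


@dataclass
class Node:
    feature: int                 # index of the binary feature tested here
    zero: Union["Node", Leaf]    # branch taken when the feature is 0
    one: Union["Node", Leaf]     # branch taken when the feature is 1


def descend(tree: Union[Node, Leaf], features: Sequence[int]) -> Leaf:
    """Follow binary feature tests from the root down to an end-node."""
    node = tree
    while isinstance(node, Node):
        node = node.one if features[node.feature] else node.zero
    return node


def match_score(glyph: Image, template: Image) -> float:
    """Stand-in matching measure: fraction of pixels that agree.

    Assumes both images have the same size.
    """
    total = 0
    agree = 0
    for glyph_row, template_row in zip(glyph, template):
        for g, t in zip(glyph_row, template_row):
            total += 1
            agree += int(g == t)
    return agree / total


def nearest_by_moments(moments: Sequence[float],
                       prototypes: Dict[str, Sequence[float]]) -> str:
    """Minimum-distance classifier over precomputed moment vectors."""
    def squared_distance(label: str) -> float:
        return sum((a - b) ** 2 for a, b in zip(moments, prototypes[label]))
    return min(prototypes, key=squared_distance)


def classify(glyph: Image,
             features: Sequence[int],
             moments: Sequence[float],
             tree: Union[Node, Leaf],
             templates: Dict[str, Image],
             prototypes: Dict[str, Sequence[float]],
             threshold: float = 0.9) -> str:
    """Tree descent, thresholded template matching, then moment fallback."""
    leaf = descend(tree, features)
    scores = {c: match_score(glyph, templates[c]) for c in leaf.candidates}
    best = max(scores, key=scores.get)
    if scores[best] >= threshold:
        return best  # confident template match at the end-node
    # Low confidence: defer to the secondary minimum-distance classifier.
    return nearest_by_moments(moments, prototypes)
```

In this sketch the threshold controls how often the secondary classifier is consulted: raising it sends more characters to the moment-based fallback, lowering it lets template matching decide more cases.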

Journal: Engineering Applications of Artificial Intelligence
Publication Date: Aug 1, 1997
Citations: 9

Similar Papers

JPEG for Arabic Handwritten Character Recognition: Add a Dimension of Application
Abdurazzag Ali ... Salem Ali
01 Oct 2008

Soft Computing Techniques for Optical Character Recognition Systems
Arindam Chaudhuri ... Pratixa Badelia
24 Dec 2016

OmniPage vs. Sakhr: paired model evaluation of two Arabic OCR products
Tapas Kanungo ... Daniel P Lopresti
07 Jan 1999

Optical Character Recognition System for Nastalique Urdu-Like Script Languages Using Supervised Learning
S S R Rizvi ... A Muhammad
International Journal of Pattern Recognition and Artificial Intelligence | VOL. 33
01 Sep 2019
