A SCRIPT-INDEPENDENT METHODOLOGY FOR OPTICAL CHARACTER RECOGNITION

John Makhoul, Richard Schwartz, Christopher Lapre, Issam Bazzi

doi:10.1016/s0031-3203(97)00152-0

Journal: Pattern Recognition
Publication Date: Sep 1, 1998
Citations: 45

Affiliation: RNET Technologies (United States)

#Continuous Speech Recognition #Hidden Markov Models

Abstract

We present a methodology for OCR with the following properties: the feature extraction, training, and recognition components are script-independent; no separate segmentation is required at the character or word level; and training is performed automatically on data that is also not presegmented. The methodology is adapted to OCR from continuous speech recognition, a field that has developed a mature and successful technology based on Hidden Markov Models. The script independence of the methodology is demonstrated in omnifont experiments on the DARPA Arabic OCR Corpus and the University of Washington English Document Image Database I.
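
To illustrate what recognition without a separate segmentation step can look like with Hidden Markov Models, the following is a minimal sketch and not the authors' system. It assumes a toy setup invented for this example: two characters ("a" and "b"), each modeled by a two-state left-to-right HMM, and a line image already reduced to a sequence of quantized frame symbols ("p", "q", "r", "s"). All names and probabilities (`CHARS`, `SYMBOL`, `trans`, `emit`, `start`) are hypothetical. Viterbi decoding finds the best state path over all frames at once, and the character boundaries are read off that path afterwards.

```python
import math

def lg(p):
    """Log of a probability, with log(0) mapped to -inf."""
    return math.log(p) if p > 0 else float("-inf")

# Hypothetical toy model: each character is a 2-state left-to-right HMM.
CHARS = ["a", "b"]
STATES = [(c, i) for c in CHARS for i in range(2)]
# Assumed frame symbol that each state prefers to emit.
SYMBOL = {("a", 0): "p", ("a", 1): "q", ("b", 0): "r", ("b", 1): "s"}

def start(s):
    # A line may begin only in the first state of some character.
    return 0.5 if s[1] == 0 else 0.0

def trans(p, s):
    if p == s:                      # self-loop: stay in the same state
        return 0.5
    if p[1] == 0:                   # first state -> second state, same character
        return 0.5 if s == (p[0], 1) else 0.0
    return 0.25 if s[1] == 0 else 0.0   # last state -> first state of any character

def emit(s, o):
    return 0.7 if SYMBOL[s] == o else 0.1

def viterbi(frames):
    """Most likely state path for a non-empty sequence of frame symbols."""
    best = [{s: lg(start(s)) + lg(emit(s, frames[0])) for s in STATES}]
    back = []
    for o in frames[1:]:
        scores, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: best[-1][p] + lg(trans(p, s)))
            scores[s] = best[-1][prev] + lg(trans(prev, s)) + lg(emit(s, o))
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

def segments(path):
    """Collapse a state path into (character, first_frame, last_frame) tuples."""
    out = []
    for t, s in enumerate(path):
        # A new character starts on entering a first state from a different state.
        if t == 0 or (s[1] == 0 and path[t - 1] != s):
            out.append([s[0], t, t])
        else:
            out[-1][2] = t
    return [tuple(seg) for seg in out]

if __name__ == "__main__":
    print(segments(viterbi(list("ppqqrss"))))
```

The input frames are never cut into characters beforehand; the segmentation is a by-product of the decoded path. A realistic system would replace the discrete symbols with feature vectors and the hand-set probabilities with trained parameters.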

Full Text