Abstract

Many formal institutions, such as companies, hospitals, and laboratories, sometimes need to exchange hand-signed reports through modern communication means such as fax and e-mail. The quality of both the scanned documents and the original paper causes problems in converting such images to text. In addition, font type and size, contrast, and background darkness adversely affect the accuracy of the resulting text. Thus, the relationship between scanned document zoom and scanning resolution in dots per inch (DPI) is investigated for a special case and type of scanned form, to enable the design of an algorithm that takes such cases into account. It is found that a much higher level of zoom and resolution is needed to achieve acceptable recognition for the special case of dark, low-contrast, small-font forms. It is also found that the optimum zoom level is set by the number of recognized words, as they are more difficult to learn and analyze.

Highlights

  • The goal of Optical Character Recognition (OCR) is to classify optical patterns corresponding to alphanumeric or other characters

  • Real cases of formal, dark-background, low-contrast forms exchanged as images through e-mail and fax machines are collected and re-scanned at various resolution and zoom levels

  • The resulting recognized files are produced as text documents with statistical analysis regarding the correctly recognized numbers and words versus resolution and zooming levels
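The kind of per-setting statistics described in the highlights can be sketched as follows. This is a minimal illustration, not the paper's procedure: the function names, the whitespace tokenization, the 0.9 word-rate threshold, and the sample strings are all hypothetical, and the OCR output is assumed to be already available as text for each (DPI, zoom) setting.

```python
from collections import Counter


def split_tokens(text):
    """Split text on whitespace into (words, numbers)."""
    tokens = text.split()
    numbers = [t for t in tokens if t.replace(".", "", 1).isdigit()]
    words = [t.lower() for t in tokens if t.isalpha()]
    return words, numbers


def count_correct(recognized, truth):
    """Count recognized tokens that also occur in the truth (multiset overlap)."""
    overlap = Counter(recognized) & Counter(truth)
    return sum(overlap.values())


def score(ocr_text, truth_text):
    """Return the fraction of true words and true numbers that were recognized."""
    rec_words, rec_nums = split_tokens(ocr_text)
    true_words, true_nums = split_tokens(truth_text)
    return {
        "words": count_correct(rec_words, true_words) / max(len(true_words), 1),
        "numbers": count_correct(rec_nums, true_nums) / max(len(true_nums), 1),
    }


def lowest_adequate_setting(results, truth_text, min_word_rate=0.9):
    """Return the smallest (dpi, zoom) whose word rate meets the threshold.

    The threshold is applied to words only, on the assumption (taken from
    the abstract) that words are the limiting factor. Returns None if no
    setting qualifies.
    """
    for setting in sorted(results):
        if score(results[setting], truth_text)["words"] >= min_word_rate:
            return setting
    return None


# Hypothetical data: keys are (dpi, zoom percent), values are OCR output.
truth = "Patient weight 72 kg dose 150"
results = {
    (300, 100): "Pat1ent we1ght 72 kg d0se 150",
    (300, 200): "Patient weight 72 kg d0se 150",
    (600, 200): "Patient weight 72 kg dose 150",
}

for setting in sorted(results):
    print(setting, score(results[setting], truth))
print("chosen:", lowest_adequate_setting(results, truth))
```

In this made-up sample the numbers are fully recovered at every setting while the words are not, so the word rate alone decides which setting is chosen.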


Summary

Introduction

The goal of Optical Character Recognition (OCR) is to classify optical patterns (often contained in a digital image) corresponding to alphanumeric or other characters. The process of OCR involves several steps including segmentation, feature extraction, and classification. Applications of OCR range from people who wish to scan in a document and have the text of that document available in a word processor, to recognition of license plate numbers and zip codes [1]-[5].

How to cite this paper: Iskandarani, M.Z. (2015) Improving the OCR of Low Contrast, Small Fonts, Dark Background Forms Using Correlated Zoom and Resolution Technique (CZRT). Journal of Data Analysis and Information Processing, 3, 34-42.

