Abstract

The rapid spread of digital cameras has created several new problems in text recognition. Experimental studies have shown that existing OCR systems cannot cope with the complex perspective and geometric distortions that arise when text documents are photographed. Text document images therefore need pre-processing so that the text lines become straight and horizontal. This article briefly reviews existing document pre-processing methods and finds that each depends on the type of distortion and is therefore not universal. We propose an information technology and a new method that involve the mathematical construction of straightened text lines in the image and the correction of heterogeneous distortion based on a page surface transformation model. This technology is more reliable than the alternatives because it is universal: it corrects any type of geometric distortion, including a combination of several types.

Keywords: Information technology; Recognition; Optical recognition system; Text document image; Image distortion
