Abstract

In this paper, we introduce an “on the device” text line recognition framework designed for mobile and embedded systems. We treat per-character segmentation as a language-independent problem and individual character recognition as a language-dependent one. The proposed solution is therefore based on two separate artificial neural networks (ANNs) connected by dynamic programming, rather than on image-processing methods for the segmentation step or an end-to-end ANN. To satisfy the tight memory constraints imposed by embedded systems and to avoid overfitting, we employ ANNs with a small number of trainable parameters. The primary purpose of our framework is the recognition of low-quality images of identity documents with complex backgrounds and a variety of languages and fonts. We demonstrate that our solution achieves high recognition accuracy on natural datasets even when trained purely on synthetic data. We use the MIDV-500 and Census 1961 Project datasets for text line recognition. The proposed method considerably surpasses the algorithmic method implemented in Tesseract 3.05, the LSTM method of Tesseract 4.00, and the unpublished method used in the ABBYY FineReader 15 system, and it is also faster than the compared solutions. We demonstrate the language independence of our segmenter through experiments with Cyrillic, Armenian, and Chinese text lines.
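
To make the interplay of the two networks more concrete, the sketch below shows one way a dynamic programming pass could combine per-position cut scores from a language-independent segmenter with per-segment confidences from a language-dependent character classifier. The function names cut_score and char_score, the additive scoring, and the segment-width bounds are illustrative assumptions, not the authors' exact formulation.

    import numpy as np

    # Hypothetical stand-ins for the two networks described above:
    #   cut_score(x)     - segmenter: confidence of placing a cut at column x
    #   char_score(a, b) - classifier: best class confidence for the slice
    #                      between columns a and b
    def best_segmentation(width, cut_score, char_score, min_w=4, max_w=40):
        """Dynamic programming over candidate cut positions.

        best[x] is the highest total score of any segmentation of the first x
        columns of the line image; back[x] remembers the previous cut.
        """
        best = np.full(width + 1, -np.inf)
        back = np.zeros(width + 1, dtype=int)
        best[0] = 0.0
        for x in range(min_w, width + 1):
            # Only segments whose width lies in [min_w, max_w] are considered.
            for prev in range(max(0, x - max_w), x - min_w + 1):
                if best[prev] == -np.inf:
                    continue
                score = best[prev] + cut_score(x) + char_score(prev, x)
                if score > best[x]:
                    best[x], back[x] = score, prev
        # Recover the cut positions by walking the back-pointers.
        cuts, x = [], width
        while x > 0:
            cuts.append(x)
            x = back[x]
        return sorted(cuts)

Because segment widths are bounded, the double loop is linear in the line width, which is relevant under the speed requirements of on-device recognition mentioned in the abstract.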

Highlights

  • Smartphones, tablet computers, and other mobile devices are becoming more popular every day

  • We provide the human performance error rate of ≈0.20% to emphasize that the majority of the classifiers presented in Table 4 nearly match human recognition ability. We claim that this result on the MNIST database proves the applicability of such lightweight artificial neural networks (ANNs) to the optical character recognition (OCR) problem (see the sketch after this list)

  • In this paper, we present our method for text line recognition, which employs two ANNs connected by a dynamic programming algorithm
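
As noted in the second highlight, classifiers with very few trainable parameters can approach human accuracy on MNIST-scale character images. The sketch below is a hypothetical PyTorch model, not the architecture from the paper; it merely illustrates what "a small number of trainable parameters" can mean in practice, here roughly nine thousand weights.

    import torch
    import torch.nn as nn

    # A deliberately small character classifier for 28x28 grayscale glyphs
    # (MNIST-like input size is an assumption made for illustration).
    class TinyCharNet(nn.Module):
        def __init__(self, n_classes=10):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 8, kernel_size=3, padding=1),   # 28x28 -> 28x28
                nn.ReLU(),
                nn.MaxPool2d(2),                              # -> 14x14
                nn.Conv2d(8, 16, kernel_size=3, padding=1),   # -> 14x14
                nn.ReLU(),
                nn.MaxPool2d(2),                              # -> 7x7
            )
            self.classifier = nn.Linear(16 * 7 * 7, n_classes)

        def forward(self, x):
            x = self.features(x)
            return self.classifier(x.flatten(1))

    model = TinyCharNet()
    # Roughly 9,100 trainable parameters in total.
    print(sum(p.numel() for p in model.parameters() if p.requires_grad))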

Introduction

Smartphones, tablet computers, and other mobile devices are becoming more popular every day. Applications for such devices include government and commercial services that often require entering data from printed documents. Several solutions for optical text recognition in images captured with mobile devices have appeared in recent years [3]–[6]. These systems fall into two groups: client-server solutions, which transfer images to a “cloud” and require an internet connection, and “on the device” methods that perform the recognition without transmitting data.
