Weighted combination of per-frame recognition results for text recognition in a video stream

O Petrova, V.L Arlazarov, V.V Arlazarov, K Bulatov

doi:10.18287/2412-6179-co-795

Abstract

The scope of uses of automated document recognition has extended and, as a result, recognition techniques that do not require specialized equipment have become more relevant. Among such techniques, document recognition using mobile devices is of interest. However, it is not always possible to ensure controlled capturing conditions and, consequently, high quality of input images. Unlike specialized scanners, mobile cameras allow using a video stream as an input, thus obtaining several images of the recognized object, captured with various characteristics. In this case, a problem of combining the information from multiple input frames arises. In this paper, we propose a weighting model for the process of combining the per-frame recognition results, two approaches to the weighted combination of the text recognition results, and two weighting criteria. The effectiveness of the proposed approaches is tested using datasets of identity documents captured with a mobile device camera in different conditions, including perspective distortion of the document image and low lighting conditions. The experimental results show that the weighted combination can improve the text recognition result quality in the video stream, and the per-character weighting method with input image focus estimation as a base criterion allows one to achieve the best results on the datasets analyzed.

Highlights

Nowadays text object recognition is widely used in government and business processes and in everyday life [1, 2].
Because the input frames obtained using a mobile device camera in uncontrolled conditions may not be of very high quality, the best combination result is obtained by combining only the 50% of frames with the highest scores.
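
The per-character weighted combination and the strategy of keeping the highest scoring half of the frames can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the per-frame results are already aligned and of equal length, that each character position is a dictionary of class membership estimates, and that one non-negative score per frame (for example, a focus estimate) is given. The names `combine_weighted`, `decode`, and `keep_fraction` are hypothetical.

```python
import math
from collections import defaultdict


def combine_weighted(frame_results, weights, keep_fraction=0.5):
    """Combine aligned per-frame recognition results with per-frame weights.

    frame_results: list of frames; each frame is a list (one item per
        character position) of dicts mapping character -> membership estimate.
    weights: one non-negative score per frame (assumed here to be something
        like a focus estimate of the input image).
    keep_fraction: share of the highest scoring frames that take part
        in the combination.
    """
    if not frame_results or len(frame_results) != len(weights):
        raise ValueError("need one weight per frame and at least one frame")

    # Keep only the highest scoring frames (at least one).
    n_keep = max(1, math.ceil(len(frame_results) * keep_fraction))
    ranked = sorted(zip(weights, frame_results),
                    key=lambda pair: pair[0], reverse=True)[:n_keep]

    total = sum(w for w, _ in ranked)
    if total <= 0:
        raise ValueError("weights of the kept frames must sum to a positive value")

    length = len(ranked[0][1])
    if any(len(result) != length for _, result in ranked):
        raise ValueError("this sketch assumes pre-aligned results of equal length")

    combined = []
    for pos in range(length):
        acc = defaultdict(float)
        for w, result in ranked:
            # Weighted average of the membership estimates at this position.
            for ch, p in result[pos].items():
                acc[ch] += w * p / total
        combined.append(dict(acc))
    return combined


def decode(combined):
    """Pick the top-scoring character at every position."""
    return "".join(max(cell, key=cell.get) for cell in combined)


# Illustrative (made-up) data: four frames of a two-character field.
frames = [
    [{"A": 0.9, "4": 0.1}, {"B": 0.8, "8": 0.2}],
    [{"A": 0.6, "4": 0.4}, {"8": 0.7, "B": 0.3}],
    [{"4": 0.9, "A": 0.1}, {"8": 0.9, "B": 0.1}],
    [{"4": 0.8, "A": 0.2}, {"8": 0.8, "B": 0.2}],
]
scores = [0.9, 0.5, 0.1, 0.2]

result = combine_weighted(frames, scores, keep_fraction=0.5)
text = decode(result)
```

With `keep_fraction=0.5`, only the two frames with the highest scores contribute, so the low-score frames cannot outvote them. Real recognition results usually differ in length between frames and would need an alignment step before this kind of per-position averaging.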

Summary

Introduction

Document recognition in uncontrolled conditions

Nowadays text object recognition is widely used in government and business processes and in everyday life [1, 2]. One of the first problems in which optical character recognition (OCR) technologies found their application was automatic data entry. Today the scope of application of such technologies has expanded, and document recognition is increasingly carried out in uncontrolled capturing conditions. Apart from the automatic input of personal data, text object recognition is essential in electronic document management systems, where it saves time, reduces expenses, and conserves natural resources [4]. The development of hardware, such as personal mobile devices, has made it possible to expand the applicability of OCR technologies to recognizing text in natural scenes and to use these technologies in such cases as driver assistance systems [5], assistance for people with visual impairments [6], online translators [7], government photo and video recording systems [8, 9], and many more. More and more cases require the possibility to use “improvised means” for the recognition, with input images captured using a smartphone camera or a web camera [11, 12].

Journal: Computer Optics
Publication Date: Feb 1, 2021
Citations: 11
License type: CC BY
