Lightweight single pass numerical reading extraction for displays in the wild

Shanmukha Yenneti, Yan-Ming Chiou, Bob Price

doi:10.2352/ei.2023.35.7.image-282

Journal: Electronic Imaging

Publication Date: Jan 16, 2023

Abstract

Although considerable progress has been made in recognizing multi-character text from images, robust, computationally efficient methods that can run on portable devices to read device displays in the wild are still lacking. We specifically address the problem of parsing digits from 7-segment displays. Recognizing these displays is important for many applications, such as augmented reality agents that assist users and need to verify their actions, or connecting legacy devices to the internet for process control using cheap cameras. Legacy techniques based on image processing operators and OCR are brittle, whereas massive deep networks are too computationally expensive. We describe a computationally tractable VGG-style backbone combined with a novel digit inference head that can be trained using a synthetic display generator with novel augmentations. We show that the model trained on augmented synthetic data generalizes well to a corpus of real-world display images, achieving 97.8% single-frame accuracy at a throughput of 30 frames per second. We describe how the output can be further stabilized to improve accuracy through a kind of mode filtering.
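
The abstract does not specify how the mode filtering is implemented, so the following is only a minimal sketch of one plausible form: a sliding-window mode over per-frame readings. The class name `ModeFilter`, the helper `stabilize`, the window size, and the tie-breaking rule are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter, deque


class ModeFilter:
    """Sliding-window mode filter over per-frame readings (hypothetical helper)."""

    def __init__(self, window=5):
        # The window size is an assumed parameter, not a value from the paper.
        if window < 1:
            raise ValueError("window must be at least 1")
        self.history = deque(maxlen=window)

    def update(self, reading):
        """Add one per-frame reading and return the most frequent reading in the window."""
        self.history.append(reading)
        counts = Counter(self.history)
        best = max(counts.values())
        # Assumed tie-break: prefer the most recent reading among the tied values.
        for value in reversed(self.history):
            if counts[value] == best:
                return value


def stabilize(readings, window=5):
    """Apply the mode filter to a whole sequence of per-frame readings."""
    filt = ModeFilter(window)
    return [filt.update(r) for r in readings]
```

For example, with a window of 3, a single misread frame such as `"72.5"` in the sequence `["12.5", "12.5", "72.5", "12.5", "12.5"]` would be outvoted by its neighbours, so every filtered output is `"12.5"`. A larger window suppresses longer bursts of misreads at the cost of reacting more slowly when the displayed value genuinely changes.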