Abstract

For visually impaired people (VIPs), the ability to convert text to sound can mean a new level of independence or the simple joy of a good book. With significant advances in optical character recognition (OCR) in recent years, a number of reading aids are appearing on the market. These reading aids convert images captured by a camera to text which can then be read aloud. However, all of these reading aids suffer from a key issue—the user must be able to visually target the text and capture an image of sufficient quality for the OCR algorithm to function—no small task for VIPs. In this work, a sound-emitting document image quality assessment metric (SEDIQA) is proposed which allows the user to hear the quality of the text image and automatically captures the best image for OCR accuracy. This work also includes testing of OCR performance against image degradations, to identify the most significant contributors to accuracy reduction. The proposed no-reference image quality assessor (NR-IQA) is validated alongside established NR-IQAs and this work includes insights into the performance of these NR-IQAs on document images. SEDIQA is found to consistently select the best image for OCR accuracy. The full system includes a document image enhancement technique which introduces improvements in OCR accuracy with an average increase of 22% and a maximum increase of 68%.
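To make the capture mechanism concrete, the sketch below shows a minimal quality-guided capture loop in Python with OpenCV. The paper's Q-metric is not reproduced here; the variance-of-Laplacian sharpness score, the tone mapping, and all parameter values are illustrative stand-ins, not the authors' method.

```python
# Minimal sketch of a SEDIQA-style capture loop (assumes OpenCV).
# Variance of the Laplacian is a stand-in no-reference quality score;
# the score-to-tone mapping is purely illustrative.
import cv2
import numpy as np

def quality_score(gray: np.ndarray) -> float:
    """Stand-in NR-IQA: variance of the Laplacian (higher = sharper)."""
    return float(cv2.Laplacian(gray, cv2.CV_64F).var())

def score_to_pitch(score: float, lo: float = 220.0, hi: float = 880.0,
                   max_score: float = 2000.0) -> float:
    """Map a quality score to a tone frequency in Hz (illustrative scale)."""
    t = min(score / max_score, 1.0)
    return lo + t * (hi - lo)

def capture_best(n_frames: int = 50):
    """Sample camera frames, sonify their quality, keep the best-scoring one."""
    cap = cv2.VideoCapture(0)
    best_score, best_frame = -1.0, None
    for _ in range(n_frames):
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        s = quality_score(gray)
        pitch = score_to_pitch(s)  # would be fed to a tone generator
        print(f"score={s:.0f} -> tone {pitch:.0f} Hz")
        if s > best_score:
            best_score, best_frame = s, frame
    cap.release()
    return best_frame, best_score
```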

Highlights

  • With advances in smartphone technology and camera quality, several visual aids for VIPs are emerging [1,2], with Microsoft’s Seeing AI as the current market front-runner

  • The Q-metric was validated on synthetically degraded document images, and its performance under these degradations was compared with established no-reference image quality assessors (NR-IQAs) as well as with OCR accuracy

  • The full sound-emitting document image quality assessment metric (SEDIQA) system was tested on the synthetic dataset and in live capture to confirm the relationship between OCR accuracy and the Q-metric (see the sketch after these highlights)

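As a hedged illustration of that validation setup, the following sketch degrades a clean document image synthetically and reports both a stand-in quality score and OCR accuracy. pytesseract, the file paths, and the ground-truth text file are assumptions, and the paper's Q-metric is again replaced by a variance-of-Laplacian score.

```python
# Illustrative validation sketch: synthetic degradation vs. OCR accuracy.
# Assumes pytesseract and a ground-truth transcription; paths are hypothetical.
import cv2
import difflib
import numpy as np
import pytesseract

def char_accuracy(truth: str, ocr_out: str) -> float:
    """Character-level similarity ratio in [0, 1]."""
    return difflib.SequenceMatcher(None, truth, ocr_out).ratio()

def degrade(gray: np.ndarray, blur_sigma: float, noise_std: float) -> np.ndarray:
    """Apply Gaussian blur (if sigma > 0) and additive Gaussian noise."""
    out = cv2.GaussianBlur(gray, (0, 0), blur_sigma) if blur_sigma > 0 else gray
    noise = np.random.normal(0.0, noise_std, out.shape)
    return np.clip(out.astype(np.float64) + noise, 0, 255).astype(np.uint8)

clean = cv2.imread("document.png", cv2.IMREAD_GRAYSCALE)  # hypothetical path
truth = open("document.txt").read()                       # ground-truth text

for sigma in (0.0, 1.0, 2.0, 4.0):
    img = degrade(clean, blur_sigma=sigma, noise_std=5.0)
    score = float(cv2.Laplacian(img, cv2.CV_64F).var())
    acc = char_accuracy(truth, pytesseract.image_to_string(img))
    print(f"sigma={sigma}: quality={score:.0f}, OCR accuracy={acc:.2f}")
```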


Introduction

With advances in smartphone technology and camera quality, several visual aids for VIPs are emerging [1,2], with Microsoft’s Seeing AI as the current market front-runner. The text-reading task relies on OCR accuracy and, in turn, on image quality, which means that the user’s performance (hand motion, visual acuity, etc.) affects the performance of the reader. Since these readers are both hand-held and designed for people with visual impairments, this is a fundamental issue that needs to be addressed. Automatic pre-processing can improve OCR performance [6,7], but even the best-performing pre-processors cannot recover high OCR accuracy from a low-quality image. It is therefore necessary to assess image quality before attempting OCR, and so a robust image quality assessment (IQA) metric is needed; a minimal version of such a quality gate is sketched below.
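The sketch assumes pytesseract for OCR; the threshold value, the stand-in sharpness score, and the adaptive-threshold enhancement step are illustrative assumptions, not the paper's method.

```python
# Sketch of a quality-gated reading pipeline: assess first, OCR only when
# the score clears a threshold. All components here are stand-ins.
from typing import Optional

import cv2
import numpy as np
import pytesseract

QUALITY_THRESHOLD = 500.0  # assumed value; would be tuned on real data

def assess(gray: np.ndarray) -> float:
    """Stand-in NR-IQA score (variance of the Laplacian)."""
    return float(cv2.Laplacian(gray, cv2.CV_64F).var())

def enhance(gray: np.ndarray) -> np.ndarray:
    """Illustrative binarization in place of the paper's enhancement step."""
    return cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                 cv2.THRESH_BINARY, 31, 10)

def read_if_good(gray: np.ndarray) -> Optional[str]:
    """Return OCR text, or None to signal the user to recapture."""
    if assess(gray) < QUALITY_THRESHOLD:
        return None
    return pytesseract.image_to_string(enhance(gray))
```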
