Improving the Performance of Automatic Lip-Reading Using Image Conversion Techniques

Ki-Seung Lee

doi:10.3390/electronics13061032

Abstract

Variation in lighting conditions is a major cause of performance degradation in pattern recognition based on optical imaging. In this study, infrared (IR) and depth images were considered as alternatives that are robust to variations in illumination, with the specific aim of improving the performance of automatic lip-reading. The variations due to lighting conditions were quantitatively analyzed for optical, IR, and depth images. Then, deep neural network (DNN)-based lip-reading rules were built for each image modality. Speech recognition techniques based on IR or depth imaging require a special camera and an additional light source that emits light in the IR range. To mitigate this problem, we propose a method that does not use an IR/depth image directly, but instead estimates it from the optical RGB image. To this end, a modified U-net was adopted to estimate the IR/depth image from an optical RGB image. The results show that the IR and depth images were rarely affected by the lighting conditions. The recognition rates for the optical, IR, and depth images were 48.29%, 95.76%, and 92.34%, respectively, under various lighting conditions. Using the estimated IR and depth images, the recognition rates were 89.35% and 80.42%, respectively. Both rates were significantly higher than the rate obtained with the optical RGB images.
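To make the comparison in the abstract concrete, the following minimal sketch works through the arithmetic on the five reported recognition rates. It uses only the figures quoted above; the dictionary keys and helper names (`gain_over_rgb`, `gap_to_measured`, `relative_error_reduction`) are hypothetical labels chosen for illustration and do not come from the paper.

```python
# Recognition rates (%) under various lighting conditions, as quoted in the abstract.
# Key names are illustrative labels, not identifiers from the paper.
rates = {
    "rgb": 48.29,
    "ir": 95.76,
    "depth": 92.34,
    "ir_estimated": 89.35,
    "depth_estimated": 80.42,
}


def gain_over_rgb(key):
    """Absolute gain (percentage points) of a modality over the optical RGB baseline."""
    return round(rates[key] - rates["rgb"], 2)


def gap_to_measured(modality):
    """Percentage points lost by using the estimated image instead of the measured one."""
    return round(rates[modality] - rates[modality + "_estimated"], 2)


def relative_error_reduction(key):
    """Fraction of the RGB error rate removed by switching to the given modality."""
    rgb_error = 100.0 - rates["rgb"]
    error = 100.0 - rates[key]
    return (rgb_error - error) / rgb_error


for modality in ("ir", "depth"):
    estimated = modality + "_estimated"
    print(
        f"{modality}: estimated image gains {gain_over_rgb(estimated):.2f} points over RGB, "
        f"trails the measured image by {gap_to_measured(modality):.2f} points, "
        f"relative error reduction {relative_error_reduction(estimated):.1%}"
    )
```

Read this way, the estimated images recover most, though not all, of the benefit of the measured IR and depth images, without requiring the special camera and IR light source.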

Journal: Electronics
Publication Date: Mar 9, 2024
License type: CC BY 4.0

